<p>Rekado: <a href="https://elephly.net/feeds/tags/planet-fsfe-en.xml">feed for tag “planet-fsfe-en”</a>, last updated 2023-09-24</p>
<h1><a href="https://elephly.net/posts/2019-09-17-rms-gnu.html">Thoughts on GNU and Richard Stallman</a></h1>
<p>Ricardo Wurmus, 2019-09-17</p>
<p>Richard Stallman has <a href="https://www.fsf.org/news/richard-m-stallman-resigns">resigned as president and from the board of
directors of the Free Software
Foundation</a>. I
welcome this decision.</p><p>As a co-maintainer of GNU packages (including Guix, the Guix Workflow
Language, the Guile Picture Language, etc.), and as a contributor to
various other GNU software, I would like to state that while I'm
grateful for Richard Stallman's founding of the GNU project and his
past contributions to GNU, it would be wrong to continue to remain
silent on the negative effects his behaviour and words have had over
the past years. His actions have hurt people and alienated them from
the free software movement.</p><p>When I joined GNU I used to think of Richard as just a bit of a quirky
person with odd habits, with a passion for nitpicking and clear
language, but also with a vision of freeing people from oppression at
the hands of a boring dystopia mediated by computers. Good
intentions, however, aren't enough. Richard's actions over the past
years sadly have been detrimental to achieving the vision that he
outlined in the <a href="https://www.gnu.org/gnu/manifesto.html">GNU
Manifesto</a>, to benefit all
computer users.</p><p>GNU's not Unix, but Richard ain't GNU either (RAGE?). GNU is bigger
than any one person, even its founder. I'm still convinced that GNU
has an important role to play towards providing a harmonized,
trustworthy, freedom-respecting operating system environment that
benefits all computer users. I call upon other maintainers of GNU
software to embrace the responsibilities that working on a social
project such as GNU brings. The GNU Manifesto states that "GNU serves
as an example to inspire and a banner to rally others to join us in
sharing". Let us do that by welcoming people of all backgrounds into
GNU and by working hard to provide a healthy environment for fruitful
collaboration.</p>
<h1><a href="https://elephly.net/posts/2018-09-01-guile-picture-language.html">A simple picture language for GNU Guile</a></h1>
<p>Ricardo Wurmus, 2018-09-01</p>
<p>One thing that I really love about Racket is its <a href="https://docs.racket-lang.org/pict/">picture
language</a>, which allows you to
play with geometric shapes in an interactive session in Dr Racket.
The shapes are displayed right there in the REPL, just like numbers or
strings. Instead of writing a programme that prints "hello world" or
that computes the Fibonacci numbers, one could write a programme that
composes differently rotated, coloured shapes and prints those
instead.</p><p>I use <a href="https://gnu.org/software/guile">GNU Guile</a> for my own projects,
and sadly we don't have an equivalent of Racket's picture language or
the Dr Racket editor environment. So I made something: a <a href="https://git.elephly.net/software/guile-picture-language.git">simple
picture language for GNU
Guile</a>.
It provides simple primitive procedures to generate shapes, to
manipulate them, and to compose them.</p><p>Download the single Guile module containing the implementation into a directory of its own:</p><pre><code>mkdir ~/pict
cd ~/pict
wget https://elephly.net/downies/pict.scm</code></pre><p>To actually see these shapes as you play with them, you need to use a
graphical instance of <a href="https://gnu.org/software/emacs">GNU Emacs</a> with
<a href="http://www.nongnu.org/geiser/">Geiser</a>.</p><p>Start geiser in Emacs and load the module:</p><pre><code>M-x run-guile
(add-to-load-path (string-append (getenv "HOME") "/pict"))
,use (pict)</code></pre><p>Let’s play!</p><pre><code>(circle 100)</code></pre><p>If you see a pretty circle: hooray! Let’s play some more:</p><pre><code>(colorize (circle 100) "red")
(disk 80)
(rectangle 50 100)</code></pre><p>Let's compose and manipulate some shapes!</p><pre><code>,use (srfi srfi-1)
,use (srfi srfi-26)
(apply hc-append
       (map (cut circle <>)
            (iota 10 2 4)))
(apply cc-superimpose
       (map (cut circle <>)
            (iota 10 2 4)))
(apply hc-append
       (map (cut rotate (rectangle 10 30) <>)
            (iota 36 0 10)))
(apply cc-superimpose
       (map (cut rotate (triangle 100 300) <>)
            (iota 36 0 10)))</code></pre><p>There are many more procedures for
manipulations. Almost all procedures in pict.scm have docstrings, so
feel free to explore the code to find fun things to play with!</p><p>PS: I realize that it's silly to have a blog post about a picture
language without any pictures. Instead of thinking about this now,
get the module and make some pretty pictures yourself!</p>
<h1><a href="https://elephly.net/posts/2017-03-24-r-with-guix.html">Using R with Guix</a></h1>
<p>Ricardo Wurmus, 2017-03-24</p>
<h2>Introducing the actors</h2><p>For the past few years I have been working on
<a href="https://gnu.org/software/guix">GNU Guix</a>, a functional
package manager. One of the most obvious benefits of a functional
package manager is that it allows you to install any number of
variants of a software package into a separate environment, without
polluting any other environment on your system. This feature makes a
lot of sense in the context of scientific computing where it may be
necessary to use different versions or variants of applications and
libraries for different projects.</p><p>Many programming languages come with their own package management
facilities, which some users rely on despite their obvious limitations.
In the case of GNU R the built-in <code>install.packages</code> procedure
makes it easy for users to quickly install packages from CRAN, and the
third-party <code>devtools</code> package extends this mechanism to
install software from other sources, such as a git repository.</p><h2>ABI incompatibilities</h2><p>Unfortunately, limitations in how binaries are executed and
linked on GNU+Linux systems make it hard for people to continue to use
the package installation facilities of the language when also using R
from Guix on a distribution of the GNU system other than GuixSD.
Packages that are installed through <code>install.packages</code> are
built on demand. Some of these packages provide bindings to other
libraries, which may be available at the system level. When these
bindings are built, R uses the compiler toolchain and the libraries the
system provides. All software in Guix, on the other hand, is
completely independent of any libraries the host system provides;
this is a direct consequence of implementing functional package
management. As a result, binaries from Guix do not have binary
compatibility with binaries built using system tools and linked with
system libraries. In other words: due to the lack of a shared ABI
between Guix binaries and system binaries, packages built with the
system toolchain and linked with non-Guix libraries cannot be loaded
into a process of a Guix binary (and vice versa).</p><p>Of course, this is not always a problem, because not all R
packages provide bindings to other libraries; but the problem usually
strikes with more complicated packages where using Guix makes a lot of
sense as it covers the whole dependency graph.</p><p>Because of this nasty problem, which cannot be solved without a
redesign of compiler toolchains and file formats, I have been
recommending that people just use Guix for everything and avoid mixing
software installation methods. Guix comes with many R packages and
for those that it doesn't include it has an importer for the CRAN and
Bioconductor repositories, which makes it easy to create Guix package
expressions for R packages. While this is certainly valid advice, it
ignores the habits of long-time R users, who may be really attached to
<code>install.packages</code> or <code>devtools</code>.</p><h2>Schroedinger's Cake</h2><p>There is another way; you can have your cake and eat it too. The
problem arises from using the incompatible libraries and toolchain
provided by the operating system. So let's just <em>not</em> do this,
mmkay? As long as we can make R from Guix use libraries and the
compiler toolchain from Guix we should not have any of these
ABI problems when using <code>install.packages</code>.</p><p>Let's create an environment containing the current version of R,
the GCC toolchain, and the GNU Fortran compiler with Guix. We could
use <code>guix environment --ad-hoc</code> here, but it's better to use a
persistent profile.</p><pre><code>$ guix package -p /path/to/.guix-profile \
    -i r gcc-toolchain gfortran</code></pre><p>To "enter" the profile I recommend using a sub-shell like this:</p><pre><code>$ bash
$ source /path/to/.guix-profile/etc/profile
$ …
$ exit</code></pre><p>When inside the sub-shell we see that we use both the GCC
toolchain and R from Guix:</p><pre><code>$ which gcc
/gnu/store/…-profile/bin/gcc
$ which R
/gnu/store/…-profile/bin/R
</code></pre><p>Note that this is a <em>minimal</em> profile; it contains the GCC
toolchain with a linker that ensures that e.g. the GNU C library from
Guix is used at link time. It does not actually contain any of the
libraries you may need to build certain packages.</p><p>Take the R package "Cairo", which provides bindings to the Cairo
rendering libraries, as an example. Trying to build it in this new
environment will fail because the Cairo libraries are not found. To
provide the required libraries we exit the environment, install the
Guix packages providing the libraries, and re-enter the environment.</p><pre><code>$ exit
$ guix package -p /path/to/.guix-profile -i cairo libxt
$ bash
$ source /path/to/.guix-profile/etc/profile
$ R
> install.packages("Cairo")
…
* DONE (Cairo)
> library(Cairo)
></code></pre><p>Yay! This should work for any R package with bindings to any
libraries that are in Guix. For this particular case you could have
installed the <code>r-cairo</code> package using Guix, of course.</p><h2>Potential problems and potential solutions</h2><p>What happens if the system provides the required header files and
libraries? Will the GCC toolchain from Guix use them? Yes. But
that's okay, because it won't be able to compile and link the binaries
anyway. When the files are provided by both Guix <em>and</em> the system, the toolchain prefers the Guix versions.</p><p>It is <em>possible</em> to prevent the R process and all its
children from ever seeing system libraries, but this requires the use
of containers, which are not available on somewhat older kernels that
are commonly used in scientific computing environments. Guix provides
support for containers, so if you use a modern Linux kernel on your
GNU system you can avoid some confusion by using either <code>guix
environment --container</code> or <code>guix container</code>. Check out
<a href="http://www.gnu.org/software/guix/manual/html_node/Invoking-guix-environment.html">the glorious manual</a>.</p><p>Another problem is that the packages you build manually do not
come with the benefits that Guix provides. This means, for example,
that these packages won't be bit-reproducible. If you want
bit-reproducible software environments: use Guix and don't look
back.</p><h2>Summary</h2><ul><li>Don't mix Guix with system things to avoid ABI conflicts.</li>
<li>If you use <code>install.packages</code>, let R from Guix use
the GCC toolchain and libraries from Guix.</li>
<li>We do this by installing the toolchain and all libraries we
need into a separate Guix profile. R runs inside of that
environment.</li></ul><h2>Learn more!</h2><p>If you want to learn more about GNU Guix I recommend taking a
look at the excellent <a href="https://www.gnu.org/software/guix/">GNU Guix project page</a>, which offers links to talks, papers,
and the manual. Feel free to contact me if you want to learn
more about packaging scientific software for Guix. It is not
difficult and we all can benefit from joining efforts in adopting
this usable, dependable, hackable, and liberating platform for
scientific computing with free software.</p><p>The Guix community is very friendly, supportive, responsive and
welcoming. I encourage you to visit the project’s <a href="https://webchat.freenode.net?channels=#guix">IRC channel #guix
on Freenode</a>, where I go by the handle “rekado”.</p><p>Read <a href="/tags/guix.html">more posts
about GNU Guix here</a>.</p>
<h1><a href="https://elephly.net/posts/2017-01-09-bootstrapping-haskell-part-1.html">Bootstrapping Haskell: part 1</a></h1>
<p>Ricardo Wurmus, 2017-01-09</p>
<div>
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#org2d8c41c">1. A short survey of Haskell implementations</a></li>
<li><a href="#orgb07f4d3">2. An early bootstrapping idea</a></li>
<li><a href="#org9ee479e">3. Setting up the environment</a></li>
<li><a href="#org9e4c0ac">4. Building the runtime</a></li>
<li><a href="#org655c609">5. Building the prelude?</a></li>
<li><a href="#org360295f">6. Detour: cpphs and GreenCard</a></li>
<li><a href="#org9519691">7. Building the prelude!</a></li>
<li><a href="#orgace25de">8. Building hmake</a></li>
<li><a href="#org83aa24b">9. To be continued</a></li>
</ul>
</div>
</div>
<p>
Haskell is a formally specified language with potentially many alternative implementations, but in early 2017 the reality is that Haskell is whatever the Glasgow Haskell Compiler (GHC) implements. Unfortunately, to build GHC one needs a previous version of GHC. This is true for all public releases of GHC all the way back to version 0.29, which was released in 1996 and which implements Haskell 1.2. Some GHC releases include files containing generated ANSI C code, which require only a C compiler to build. For most purposes, generated code does not qualify as source code.
</p>
<p>
So I wondered: <i>is it possible to construct a procedure to build a modern release of GHC from source without depending on any generated code or pre-built binaries of an older variant of GHC?</i> The answer to this question depends on the answers to a number of related questions. One of them is: are there any alternative Haskell implementations that are still usable today and that can be built without GHC?
</p>
<div id="outline-container-org2d8c41c" class="outline-2">
<h2 id="org2d8c41c">A short survey of Haskell implementations</h2>
<div id="text-1" class="outline-text-2">
<p>
Although nowadays hardly anyone uses any Haskell compiler but GHC in production, there are some alternative Haskell implementations that were protected from bit rot and thus can still be built from source with today’s common toolchains.
</p>
<p>
One of the oldest implementations is <a href="http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/syntax/haskell/0.html">Yale Haskell</a>, a Haskell system embedded in Common Lisp. The last release of Yale Haskell was version 2.0.5 in the early 1990s.<sup><a id="fnr.1" href="#fn.1" class="footref">1</a></sup> Yale Haskell runs on top of CMU Common Lisp, Lucid Common Lisp, Allegro Common Lisp, or Harlequin LispWorks, but since I do not have access to any of these proprietary Common Lisp implementations, I ported the Yale Haskell system to GNU CLISP. The <a href="http://git.elephly.net/software/yale-haskell.git">code for the port is available here</a>. Yale Haskell is not a compiler; it can only be used as an interpreter.
</p>
<p>
Another Haskell interpreter with a more recent release is <a href="https://haskell.org/hugs">Hugs</a>. Hugs is written in C and implements almost all of the Haskell 98 standard. It also comes with a number of <a href="https://www.haskell.org/hugs/pages/users_guide/hugs-ghc.htm">useful language extensions</a> that GHC and other Haskell systems depend on. Unfortunately, it cannot deal with mutually recursive module dependencies, which is a feature that even the earliest versions of GHC rely on. This means that running a variant of GHC inside of Hugs is not going to work without major changes.
</p>
<p>
An alternative Haskell compiler that does not need to be built with GHC is <a href="https://www.haskell.org/nhc98">nhc98</a>. Its latest release was in 2010, which is much more recent than any of the other Haskell implementations mentioned so far. nhc98 is written in Haskell, so a Haskell compiler or interpreter is required to build it. Like GHC the release of nhc98 comes with files containing generated C code, but depending on them for a clean bootstrap is almost as bad as depending on a third-party binary. Sadly, nhc98 has another shortcoming: it is restricted to 32-bit machine architectures.
</p>
</div>
</div>
<div id="outline-container-orgb07f4d3" class="outline-2">
<h2 id="orgb07f4d3">An early bootstrapping idea</h2>
<div id="text-2" class="outline-text-2">
<p>
Since nhc98 is written in C (the runtime) and standard Haskell 98, we can run the Haskell parts of the compiler inside of a Haskell 98 interpreter. Luckily, we have an interpreter that fits the bill: Hugs! If we can interpret and run enough parts of nhc98 with Hugs we might be able to use nhc98 on Hugs to build a native version of nhc98 and related tools (such as cpphs, hmake, and cabal). Using the native compiler we can build a complete toolchain and just <i>maybe</i> that’s enough to build an early version of GHC. Once we have an early version of GHC we can go ahead and build later versions with ease. (Building GHC directly using nhc98 on Hugs might also work, but due to complexity in GHC modules it seems better to avoid depending on Hugs at runtime.)
</p>
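<p>The plan above can be sketched as a chain of bootstrap stages. This is a hypothetical roadmap rather than a verified procedure; whether each arrow actually works is exactly what the rest of this post investigates:</p>

```
Hugs (written in C; needs only a C toolchain)
  └─> interpret the nhc98 sources with Hugs ("nhc98 on Hugs")
        └─> use nhc98-on-Hugs plus GCC to compile the nhc98 prelude
            and compiler sources into a native nhc98
              └─> build the rest of the toolchain natively
                  (cpphs, hmake, cabal)
                    └─> build an early GHC release
                          └─> build successively newer GHCs
```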
<p>
At this point I have verified that (with minor modifications) nhc98 can indeed be run on top of Hugs and that (with enough care to pre-processing and dependency ordering) it can build a native library from the Haskell source files of the nhc98 prelude. It is not clear whether nhc98 would be capable of building a version of GHC, nor how close to a modern version of GHC we can get with just nhc98. There are also problems with the nhc98 runtime on modern x86_64 systems (more on that at the end).
</p>
</div>
</div>
<div id="outline-container-org9ee479e" class="outline-2">
<h2 id="org9ee479e">Setting up the environment</h2>
<div id="text-3" class="outline-text-2">
<p>
Before we can start let’s prepare a suitable environment. Since nhc98 can only be used on 32-bit architectures we need a GCC toolchain for i686. With <a href="https://gnu.org/software/guix">GNU Guix</a> it’s easy to set up a temporary environment containing just the right tools: the GCC toolchain, make, and Hugs.
</p>
<div class="org-src-container">
<pre class="src src-sh">guix environment --system=i686-linux \
--ad-hoc gcc-toolchain@4 make hugs
</pre>
</div>
</div>
</div>
<div id="outline-container-org9e4c0ac" class="outline-2">
<h2 id="org9e4c0ac">Building the runtime</h2>
<div id="text-4" class="outline-text-2">
<p>
Now we can configure nhc98 and build the C runtime, which is needed to link the binary objects that nhc98 and GCC produce when compiling Haskell sources. Configuration is easy:
</p>
<div class="org-src-container">
<pre class="src src-sh">cd /path/to/nhc98-1.22
export NHCDIR=$PWD
./configure
</pre>
</div>
<p>
Next, we build the C runtime:
</p>
<div class="org-src-container">
<pre class="src src-sh">cd src/runtime
make
</pre>
</div>
<p>
This produces binary objects in an architecture-specific directory. In my case this is <code>targets/x86_64-Linux</code>.
</p>
</div>
</div>
<div id="outline-container-org655c609" class="outline-2">
<h2 id="org655c609">Building the prelude?</h2>
<div id="text-5" class="outline-text-2">
<p>
The standard library in Haskell is called the prelude. Most source files of nhc98 depend on the prelude in one way or another. Although Hugs comes with its own prelude it is of little use for our purposes as components of nhc98 must be linked with a static prelude library object. Hugs does not provide a suitable object that we could link to.
</p>
<p>
To build the prelude we need a Haskell compiler that has the same call interface as nhc98. Interestingly, much of the nhc98 compiler’s user interface is implemented as a shell script, which after extensive argument processing calls <code>nhc98comp</code> to translate Haskell source files into C code, and then runs GCC over the C files to create binary objects. Since we do not have <code>nhc98comp</code> at this point, we need to fake it with Hugs (more on that later).
</p>
</div>
</div>
<div id="outline-container-org360295f" class="outline-2">
<h2 id="org360295f">Detour: cpphs and GreenCard</h2>
<div id="text-6" class="outline-text-2">
<p>
Unfortunately, this is not the only problem. Some of the prelude’s source files require pre-processing by a tool called GreenCard, which generates boilerplate FFI code. Of course, GreenCard is written in Haskell. Since we cannot build a native GreenCard binary without the native nhc98 prelude library, we need to make GreenCard run on Hugs. Some of the GreenCard sources require pre-processing with <code>cpphs</code>. Luckily, that’s just a really simple Haskell script, so running it in Hugs is trivial. We will need <code>cpphs</code> later again, so it makes sense to write a script for it. Let’s call it <code>hugs-cpphs</code> and drop it in <code>${NHCDIR}</code>.
</p>
<div class="org-src-container">
<pre class="src src-sh">#!/bin/bash
runhugs ${NHCDIR}/src/cpphs/cpphs.hs --noline -D__HASKELL98__ "$@"
</pre>
</div>
<p>
Make it executable:
</p>
<div class="org-src-container">
<pre class="src src-sh">chmod +x hugs-cpphs
</pre>
</div>
<p>
Okay, let’s first pre-process the GreenCard sources with <code>cpphs</code>. To do that I ran the following commands:
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}/src/greencard
CPPPRE="${NHCDIR}/hugs-cpphs -D__NHC__"
FILES="DIS.lhs \
HandLex.hs \
ParseLib.hs \
HandParse.hs \
FillIn.lhs \
Proc.lhs \
NHCBackend.hs \
NameSupply.lhs \
Process.lhs"
for file in $FILES; do
cp $file $file.original && $CPPPRE $file.original > $file && rm $file.original
done
</pre>
</div>
<p>
The result is a bunch of GreenCard source files without these pesky CPP pre-processor directives. Hugs can pre-process sources on the fly, but this makes evaluation orders of magnitude slower. Pre-processing the sources once before using them repeatedly seems like a better choice.
</p>
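<p>The copy, filter, overwrite loop above is an idiom that recurs later in this post, so here it is as a self-contained sketch. The filter here is a stand-in (plain <code>sed</code> deleting CPP directives, instead of the real <code>hugs-cpphs -D__NHC__</code> invocation) and the file name is made up, so it can be tried without an nhc98 tree:</p>

```shell
#!/bin/bash
# Sketch of the in-place pre-processing idiom used above. "sed /^#/d"
# stands in for the real hugs-cpphs command; Demo.hs is a made-up file.
preprocess_in_place() {
  local filter="$1"; shift
  local file
  for file in "$@"; do
    # Keep a copy, write the filtered output over the original,
    # then drop the copy -- exactly as in the loop above.
    cp "$file" "$file.original" &&
      $filter "$file.original" > "$file" &&
      rm "$file.original"
  done
}

printf '#ifdef __NHC__\nfoo = ()\n#endif\nbar = ()\n' > Demo.hs
preprocess_in_place "sed /^#/d" Demo.hs
cat Demo.hs    # the CPP lines are gone; only the two bindings remain
```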
<p>
There is still a minor problem with GreenCard on Hugs. The GreenCard sources import the module <code>NonStdTrace</code>, which depends on built-in prelude functions from nhc98. Obviously, they are not available when running on Hugs (it has its own prelude implementation), so we need to provide an alternative using just the regular Hugs prelude. The following snippet creates a file named <code>src/prelude/NonStd/NonStdTraceBootstrap.hs</code> with the necessary changes.
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}/src/prelude/NonStd/
sed -e 's|NonStdTrace|NonStdTraceBootstrap|' \
-e 's|import PreludeBuiltin||' \
-e 's|_||g' NonStdTrace.hs > NonStdTraceBootstrap.hs
</pre>
</div>
<p>
Then we change a single line in <code>src/greencard/NHCBackend.hs</code> to make it import <code>NonStdTraceBootstrap</code> instead of <code>NonStdTrace</code>.
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}/src/greencard
sed -i -e 's|NonStdTrace|NonStdTraceBootstrap|' NHCBackend.hs
</pre>
</div>
<p>
To run GreenCard we still need a driver script. Let’s call this <code>hugs-greencard</code> and place it in <code>${NHCDIR}</code>:
</p>
<div class="org-src-container">
<pre class="src src-sh">#!/bin/bash
HUGSDIR="$(dirname $(readlink -f $(which runhugs)))/../"
SEARCH_HUGS=$(printf "${NHCDIR}/src/%s/*:" compiler prelude libraries)
runhugs -98 \
-P${HUGSDIR}/lib/hugs/packages/*:${NHCDIR}/include/*:${SEARCH_HUGS} \
${NHCDIR}/src/greencard/GreenCard.lhs \
"$@"
</pre>
</div>
<p>
Make it executable:
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}
chmod +x hugs-greencard
</pre>
</div>
</div>
</div>
<div id="outline-container-org9519691" class="outline-2">
<h2 id="org9519691">Building the prelude!</h2>
<div id="text-7" class="outline-text-2">
<p>
Where were we? Ah, the prelude. As stated earlier, we need a working replacement for <code>nhc98comp</code>, which will be called by the driver script <code>script/nhc98</code> (created by the configure script). Let’s call the replacement <code>hugs-nhc</code>, and again we’ll dump it in <code>${NHCDIR}</code>. Here it is in all its glory:
</p>
<div class="org-src-container">
<pre class="src src-sh">#!/bin/bash
# Root directory of Hugs installation
HUGSDIR="$(dirname $(readlink -f $(which runhugs)))/../"
# TODO: "libraries" alone may be sufficient
SEARCH_HUGS=$(printf "${NHCDIR}/src/%s/*:" compiler prelude libraries)
# Filter everything from "+RTS" to "-RTS" from $@ because MainNhc98.hs
# does not know what to do with these flags.
ARGS=""
SKIP="false"
for arg in "$@"; do
if [[ $arg == "+RTS" ]]; then
SKIP="true"
elif [[ $arg == "-RTS" ]]; then
SKIP="false"
elif [[ $SKIP == "false" ]]; then
ARGS="${ARGS} $arg"
fi
done
runhugs -98 \
-P${HUGSDIR}/lib/hugs/packages/*:${SEARCH_HUGS} \
${NHCDIR}/src/compiler98/MainNhc98.hs \
$ARGS
</pre>
</div>
<p>
All this does is run Hugs (<code>runhugs</code>) with language extensions (<code>-98</code>), ensures that Hugs knows where to look for Hugs and nhc98 modules (<code>-P</code>), loads up the compiler’s <code>main</code> function, and then passes any arguments other than RTS flags (<code>$ARGS</code>) to it.
</p>
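<p>The <code>+RTS</code>/<code>-RTS</code> filtering is the only subtle part of the script, so here is the same loop extracted into a stand-alone function you can poke at directly (the function name and the example flags are made up for illustration):</p>

```shell
#!/bin/bash
# Stand-alone version of the argument filtering in hugs-nhc:
# everything from "+RTS" up to the matching "-RTS" is dropped.
filter_rts_args() {
  local ARGS="" SKIP="false" arg
  for arg in "$@"; do
    if [[ $arg == "+RTS" ]]; then
      SKIP="true"
    elif [[ $arg == "-RTS" ]]; then
      SKIP="false"
    elif [[ $SKIP == "false" ]]; then
      ARGS="${ARGS} $arg"
    fi
  done
  # Unquoted on purpose: collapses the accumulated leading space.
  echo $ARGS
}

filter_rts_args -c Foo.hs +RTS -K2M -RTS -o Foo.o   # prints: -c Foo.hs -o Foo.o
```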
<p>
Let’s also make this executable:
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}
chmod +x hugs-nhc
</pre>
</div>
<p>
The compiler sources contain pre-processor directives, which need to be removed before running <code>hugs-nhc</code>. It would be foolish to let Hugs pre-process the sources at runtime with <code>-F</code>. In my tests it made <code>hugs-nhc</code> run slower by an order of magnitude. Let’s pre-process the sources of the compiler and the libraries it depends on with <code>hugs-cpphs</code> (see above):
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}
CPPPRE="${NHCDIR}/hugs-cpphs -D__HUGS__"
FILES="src/compiler98/GcodeLowC.hs \
src/libraries/filepath/System/FilePath.hs \
src/libraries/filepath/System/FilePath/Posix.hs"
for file in $FILES; do
cp $file $file.original && $CPPPRE $file.original > $file && rm $file.original
done
</pre>
</div>
<p>
The compiler’s driver script <code>script/nhc98</code> expects to find the executables of <code>hmake-PRAGMA</code>, <code>greencard-nhc98</code>, and <code>cpphs</code> in the architecture-specific lib directory (in my case that’s <code>${NHCDIR}/lib/x86_64-Linux/</code>). They do not exist, obviously, but for two of them we already have scripts to run them on top of Hugs. <code>hmake-PRAGMA</code> does not seem to be very important; replacing it with <code>cat</code> appears to be fine. To pacify the compiler script it’s easiest to just replace a few definitions:
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}
sed -i \
-e '0,/^GREENCARD=.*$/s||GREENCARD="$NHC98BINDIR/../hugs-greencard"|' \
-e '0,/^CPPHS=.*$/s||CPPHS="$NHC98BINDIR/../hugs-cpphs -D__NHC__"|' \
-e '0,/^PRAGMA=.*$/s||PRAGMA=cat|' \
script/nhc98
</pre>
</div>
<p>
Initially, this looked like it would be enough, but half-way through building the prelude Hugs choked when interpreting nhc98 to build a certain module. After some experimentation it turned out that the <code>NHC.FFI</code> module in <code>src/prelude/FFI/CTypes.hs</code> is too big for Hugs. Running nhc98 on that module causes Hugs to abort with an overflow in the control stack. The fix here is to break up the module to make it easier for nhc98 to build it, which in turn prevents Hugs from doing too much work at once.
</p>
<p>
Apply this patch:
</p>
<pre class="example">
From 9eb2a2066eb9f93e60e447aab28479af6c8b9759 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <rekado@elephly.net>
Date: Sat, 7 Jan 2017 22:31:41 +0100
Subject: [PATCH] Split up CTypes

This is necessary to avoid a control stack overflow in Hugs when
building the FFI library with nhc98 running on Hugs.
---
src/prelude/FFI/CStrings.hs | 2 ++
src/prelude/FFI/CTypes.hs | 14 --------------
src/prelude/FFI/CTypes1.hs | 20 ++++++++++++++++++++
src/prelude/FFI/CTypes2.hs | 22 ++++++++++++++++++++++
src/prelude/FFI/CTypesExtra.hs | 2 ++
src/prelude/FFI/FFI.hs | 2 ++
src/prelude/FFI/Makefile | 8 ++++----
src/prelude/FFI/MarshalAlloc.hs | 2 ++
src/prelude/FFI/MarshalUtils.hs | 2 ++
9 files changed, 56 insertions(+), 18 deletions(-)
create mode 100644 src/prelude/FFI/CTypes1.hs
create mode 100644 src/prelude/FFI/CTypes2.hs
diff --git a/src/prelude/FFI/CStrings.hs b/src/prelude/FFI/CStrings.hs
index 18fdfa9..f1373cf 100644
--- a/src/prelude/FFI/CStrings.hs
+++ b/src/prelude/FFI/CStrings.hs
@@ -23,6 +23,8 @@ module NHC.FFI (
import MarshalArray
import CTypes
+import CTypes1
+import CTypes2
import Ptr
import Word
import Char
diff --git a/src/prelude/FFI/CTypes.hs b/src/prelude/FFI/CTypes.hs
index 18e9d60..942e7a1 100644
--- a/src/prelude/FFI/CTypes.hs
+++ b/src/prelude/FFI/CTypes.hs
@@ -4,11 +4,6 @@ module NHC.FFI
-- Typeable, Storable, Bounded, Real, Integral, Bits
CChar(..), CSChar(..), CUChar(..)
, CShort(..), CUShort(..), CInt(..), CUInt(..)
- , CLong(..), CULong(..), CLLong(..), CULLong(..)
-
- -- Floating types, instances of: Eq, Ord, Num, Read, Show, Enum,
- -- Typeable, Storable, Real, Fractional, Floating, RealFrac, RealFloat
- , CFloat(..), CDouble(..), CLDouble(..)
) where
import NonStdUnsafeCoerce
@@ -29,12 +24,3 @@ INTEGRAL_TYPE(CShort,Int16)
INTEGRAL_TYPE(CUShort,Word16)
INTEGRAL_TYPE(CInt,Int)
INTEGRAL_TYPE(CUInt,Word32)
-INTEGRAL_TYPE(CLong,Int32)
-INTEGRAL_TYPE(CULong,Word32)
-INTEGRAL_TYPE(CLLong,Int64)
-INTEGRAL_TYPE(CULLong,Word64)
-
-FLOATING_TYPE(CFloat,Float)
-FLOATING_TYPE(CDouble,Double)
--- HACK: Currently no long double in the FFI, so we simply re-use double
-FLOATING_TYPE(CLDouble,Double)
diff --git a/src/prelude/FFI/CTypes1.hs b/src/prelude/FFI/CTypes1.hs
new file mode 100644
index 0000000..81ba0f5
--- /dev/null
+++ b/src/prelude/FFI/CTypes1.hs
@@ -0,0 +1,20 @@
+{-# OPTIONS_COMPILE -cpp #-}
+module NHC.FFI
+ ( CLong(..), CULong(..), CLLong(..), CULLong(..)
+ ) where
+
+import NonStdUnsafeCoerce
+import Int ( Int8, Int16, Int32, Int64 )
+import Word ( Word8, Word16, Word32, Word64 )
+import Storable ( Storable(..) )
+-- import Data.Bits( Bits(..) )
+-- import NHC.SizedTypes
+import Monad ( liftM )
+import Ptr ( castPtr )
+
+#include "CTypes.h"
+
+INTEGRAL_TYPE(CLong,Int32)
+INTEGRAL_TYPE(CULong,Word32)
+INTEGRAL_TYPE(CLLong,Int64)
+INTEGRAL_TYPE(CULLong,Word64)
diff --git a/src/prelude/FFI/CTypes2.hs b/src/prelude/FFI/CTypes2.hs
new file mode 100644
index 0000000..7d66242
--- /dev/null
+++ b/src/prelude/FFI/CTypes2.hs
@@ -0,0 +1,22 @@
+{-# OPTIONS_COMPILE -cpp #-}
+module NHC.FFI
+ ( -- Floating types, instances of: Eq, Ord, Num, Read, Show, Enum,
+ -- Typeable, Storable, Real, Fractional, Floating, RealFrac, RealFloat
+ CFloat(..), CDouble(..), CLDouble(..)
+ ) where
+
+import NonStdUnsafeCoerce
+import Int ( Int8, Int16, Int32, Int64 )
+import Word ( Word8, Word16, Word32, Word64 )
+import Storable ( Storable(..) )
+-- import Data.Bits( Bits(..) )
+-- import NHC.SizedTypes
+import Monad ( liftM )
+import Ptr ( castPtr )
+
+#include "CTypes.h"
+
+FLOATING_TYPE(CFloat,Float)
+FLOATING_TYPE(CDouble,Double)
+-- HACK: Currently no long double in the FFI, so we simply re-use double
+FLOATING_TYPE(CLDouble,Double)
diff --git a/src/prelude/FFI/CTypesExtra.hs b/src/prelude/FFI/CTypesExtra.hs
index ba3f15b..7cbdcbb 100644
--- a/src/prelude/FFI/CTypesExtra.hs
+++ b/src/prelude/FFI/CTypesExtra.hs
@@ -20,6 +20,8 @@ import Storable ( Storable(..) )
import Monad ( liftM )
import Ptr ( castPtr )
import CTypes
+import CTypes1
+import CTypes2
#include "CTypes.h"
diff --git a/src/prelude/FFI/FFI.hs b/src/prelude/FFI/FFI.hs
index 9d91e57..0c29394 100644
--- a/src/prelude/FFI/FFI.hs
+++ b/src/prelude/FFI/FFI.hs
@@ -217,6 +217,8 @@ import MarshalUtils -- routines for basic marshalling
import MarshalError -- routines for basic error-handling
import CTypes -- newtypes for various C basic types
+import CTypes1
+import CTypes2
import CTypesExtra -- types for various extra C types
import CStrings -- C pointer to array of char
import CString -- nhc98-only
diff --git a/src/prelude/FFI/Makefile b/src/prelude/FFI/Makefile
index 99065f8..e229672 100644
--- a/src/prelude/FFI/Makefile
+++ b/src/prelude/FFI/Makefile
@@ -18,7 +18,7 @@ EXTRA_C_FLAGS =
SRCS = \
Addr.hs Ptr.hs FunPtr.hs Storable.hs \
ForeignObj.hs ForeignPtr.hs Int.hs Word.hs \
- CError.hs CTypes.hs CTypesExtra.hs CStrings.hs \
+ CError.hs CTypes.hs CTypes1.hs CTypes2.hs CTypesExtra.hs CStrings.hs \
MarshalAlloc.hs MarshalArray.hs MarshalError.hs MarshalUtils.hs \
StablePtr.hs
@@ -38,12 +38,12 @@ Word.hs: Word.hs.cpp
# dependencies generated by hmake -Md: (and hacked by MW)
${OBJDIR}/MarshalError.$O: ${OBJDIR}/Ptr.$O
${OBJDIR}/MarshalUtils.$O: ${OBJDIR}/Ptr.$O ${OBJDIR}/Storable.$O \
- ${OBJDIR}/MarshalAlloc.$O ${OBJDIR}/CTypes.$O ${OBJDIR}/CTypesExtra.$O
+ ${OBJDIR}/MarshalAlloc.$O ${OBJDIR}/CTypes.$O ${OBJDIR}/CTypes1.$O ${OBJDIR}/CTypes2.$O ${OBJDIR}/CTypesExtra.$O
${OBJDIR}/MarshalArray.$O: ${OBJDIR}/Ptr.$O ${OBJDIR}/Storable.$O \
${OBJDIR}/MarshalAlloc.$O ${OBJDIR}/MarshalUtils.$O
-${OBJDIR}/CTypesExtra.$O: ${OBJDIR}/Int.$O ${OBJDIR}/Word.$O ${OBJDIR}/CTypes.$O
+${OBJDIR}/CTypesExtra.$O: ${OBJDIR}/Int.$O ${OBJDIR}/Word.$O ${OBJDIR}/CTypes.$O ${OBJDIR}/CTypes1.$O ${OBJDIR}/CTypes2.$O
${OBJDIR}/CTypes.$O: ${OBJDIR}/Int.$O ${OBJDIR}/Word.$O ${OBJDIR}/Storable.$O \
- ${OBJDIR}/Ptr.$O
+ ${OBJDIR}/Ptr.$O ${OBJDIR}/CTypes1.$O ${OBJDIR}/CTypes2.$O
${OBJDIR}/CStrings.$O: ${OBJDIR}/MarshalArray.$O ${OBJDIR}/CTypes.$O \
${OBJDIR}/Ptr.$O ${OBJDIR}/Word.$O
${OBJDIR}/MarshalAlloc.$O: ${OBJDIR}/Ptr.$O ${OBJDIR}/Storable.$O \
diff --git a/src/prelude/FFI/MarshalAlloc.hs b/src/prelude/FFI/MarshalAlloc.hs
index 34ac7b3..5b43554 100644
--- a/src/prelude/FFI/MarshalAlloc.hs
+++ b/src/prelude/FFI/MarshalAlloc.hs
@@ -14,6 +14,8 @@ import ForeignPtr (FinalizerPtr(..))
import Storable
import CError
import CTypes
+import CTypes1
+import CTypes2
import CTypesExtra (CSize)
import NHC.DErrNo
diff --git a/src/prelude/FFI/MarshalUtils.hs b/src/prelude/FFI/MarshalUtils.hs
index 312719b..bd9d149 100644
--- a/src/prelude/FFI/MarshalUtils.hs
+++ b/src/prelude/FFI/MarshalUtils.hs
@@ -29,6 +29,8 @@ import Ptr
import Storable
import MarshalAlloc
import CTypes
+import CTypes1
+import CTypes2
import CTypesExtra
-- combined allocation and marshalling
--
2.11.0
</pre>
<p>
After all this it’s time for a break. Run the following commands, then step away for a while:
</p>
<div class="org-src-container">
<pre class="src src-sh">cd ${NHCDIR}/src/prelude
time make NHC98COMP=$NHCDIR/hugs-nhc
</pre>
</div>
<p>
After the break—it took more than two hours on my laptop—you should see output like this:
</p>
<pre class="example">
ranlib /path/to/nhc98-1.22/lib/x86_64-Linux/Prelude.a
</pre>
<p>
Congratulations! You now have a native nhc98 prelude library!
</p>
</div>
</div>
<div id="outline-container-orgace25de" class="outline-2">
<h2 id="orgace25de">Building hmake</h2>
<div id="text-8" class="outline-text-2">
<p>
The compiler and additional Haskell libraries all require a tool called “hmake” to automatically order dependencies, so we’ll try to build it next. There’s just a small problem with one of the source files: <code>src/hmake/FileName.hs</code> contains the name “Niklas Röjemo” and the compiler really does not like the umlaut. With apologies to Niklas we change the copyright line to appease the compiler.
</p>
<div class="org-src-container">
<pre class="src src-sh">cd $NHCDIR/src/hmake
mv FileName.hs{,.broken}
tr '\366' 'o' < FileName.hs.broken > FileName.hs
rm FileName.hs.broken
NHC98COMP=$NHCDIR/hugs-nhc make HC=$NHCDIR/script/nhc98
</pre>
</div>
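<p>
The octal escape <code>\366</code> in that <code>tr</code> invocation is byte 0xF6, the Latin-1 code for “ö”. A small C sketch (my own illustration, not part of the build) of the byte-level substitution being performed:
</p>
<pre class="example">
#include "assert.h"
#include "string.h"

/* 0366 octal = 0xF6, the Latin-1 code point for "ö" */
static char tr_byte(char c)
{
    return (unsigned char)c == 0366 ? 'o' : c;
}

int main(void)
{
    /* "Röjemo" as Latin-1 bytes */
    char name[] = { 'R', (char)0366, 'j', 'e', 'm', 'o', 0 };
    for (char *p = name; *p; p++)
        *p = tr_byte(*p);       /* same mapping as tr '\366' 'o' */
    assert(strcmp(name, "Rojemo") == 0);
    return 0;
}
</pre>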
</div>
</div>
<div id="outline-container-org83aa24b" class="outline-2">
<h2 id="org83aa24b">To be continued</h2>
<div id="text-9" class="outline-text-2">
<p>
Unfortunately, the hmake tools are not working. All of the tools (e.g. <code>MkConfig</code>) fail with an early segmentation fault. There must be an error in the runtime, likely in <code>src/runtime/Kernel/mutator.c</code> where bytecode for heap and stack operations is interpreted. One thing that looks like a problem is statements like this:
</p>
<pre class="example">
*--sp = (NodePtr) constptr[-HEAPOFFSET(ip[0])];
</pre>
<p>
<code>constptr</code> is <code>NULL</code>, so this seems to be just pointer arithmetic expressed in array notation. These errors can be fixed by rewriting the statement to use explicit pointer arithmetic:
</p>
<pre class="example">
*--sp = (NodePtr) (constptr + (-HEAPOFFSET(ip[0])));
</pre>
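<p>
The two forms are equivalent by C’s definition of the subscript operator (<code>e1[e2]</code> means <code>*(e1 + e2)</code>), so the rewrite only changes how the address computation is spelled out. A standalone sketch of the equivalence (my own illustration, not nhc98 code):
</p>
<pre class="example">
#include "assert.h"

/* a pretend heap and base pointer, standing in for nhc98's constptr */
static int heap[4] = { 10, 20, 30, 40 };

/* off plays the role of HEAPOFFSET(ip[0]) */
int via_subscript(int off)
{
    int *constptr = heap + 3;
    return constptr[-off];              /* array notation */
}

int via_arithmetic(int off)
{
    int *constptr = heap + 3;
    return *(constptr + (-off));        /* explicit pointer arithmetic */
}

int main(void)
{
    assert(via_subscript(2) == via_arithmetic(2));
    assert(via_subscript(2) == 20);
    return 0;
}
</pre>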
<p>
Unfortunately, this doesn’t seem to be enough, as there is another segfault in the handling of the <code>EvalTOS</code> label. <code>IND_REMOVE</code> is applied to the contents of the stack pointer, which turn out to be <code>0x10</code>, a value that just doesn’t seem right. <code>IND_REMOVE</code> removes indirection by following pointer addresses until the value stored at the given address no longer looks like an address. This fails because <code>0x10</code> does look like an address—it’s just an invalid one. I have enabled a bunch of tracing and debugging features, but I don’t fully understand how the nhc98 runtime is <i>supposed</i> to work.
</p>
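<p>
As a mental model, <code>IND_REMOVE</code> can be pictured as a loop that keeps chasing any word that passes a cheap looks-like-an-address test; a corrupt word such as <code>0x10</code> passes that test, so the loop dereferences it and crashes. A toy version (the tagging scheme here is invented for illustration; nhc98’s real scheme differs):
</p>
<pre class="example">
#include "assert.h"
#include "stdint.h"

/* Toy tagging scheme: a word with its low bit clear "looks like an
 * address" (an indirection); an odd word is a plain value. */
typedef uintptr_t Node;

static Node *ind_remove(Node *p)
{
    while ((*p & 1) == 0)       /* word looks like an address ... */
        p = (Node *)*p;         /* ... so follow the indirection  */
    return p;
}

int main(void)
{
    Node value = 43;            /* odd: a plain value, chain ends here */
    Node ind2 = (Node)&value;   /* indirection to value */
    Node ind1 = (Node)&ind2;    /* indirection to indirection */

    assert(*ind_remove(&ind1) == 43);
    /* A bogus word like 0x10 is even, so ind_remove would follow it
     * and dereference address 0x10: a segmentation fault. */
    return 0;
}
</pre>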
<p>
Judging from mails on the nhc-bugs and nhc-users lists I see that I’m not the only one experiencing segfaults. <a href="https://mail.haskell.org/pipermail/nhc-bugs/2005-October/000529.html">This email</a> suggests that segfaults are “associated with changes in the way gcc lays out static arrays of bytecodes, e.g. by putting extra padding space between arrays that are supposed to be adjacent.” I may have to try different compiler flags or an older version of GCC; I only tried with GCC 4.9.4 but the <a href="http://sources.debian.net/src/nhc98/1.16-15/debian/rules/">Debian package for nhc98</a> used version 2.95 or 3.3.
</p>
<p>
For completeness’ sake, here’s the trace of the failing execution of <code>MkProg</code>:
</p>
<pre class="example">
(gdb) run
Starting program: /path/to/nhc98-1.22/lib/x86_64-Linux/MkProg
ZAP_ARG_I1 hp=0x80c5010 sp=0x8136a10 fp=0x8136a10 ip=0x8085140
NEEDHEAP_I32 hp=0x80c5010 sp=0x8136a10 fp=0x8136a10 ip=0x8085141
HEAP_CVAL_N1 hp=0x80c5010 sp=0x8136a10 fp=0x8136a10 ip=0x8085142
HEAP_CVAL_I3 hp=0x80c5014 sp=0x8136a10 fp=0x8136a10 ip=0x8085144
HEAP_OFF_N1 hp=0x80c5018 sp=0x8136a10 fp=0x8136a10 ip=0x8085145
PUSH_HEAP hp=0x80c501c sp=0x8136a10 fp=0x8136a10 ip=0x8085147
HEAP_CVAL_I4 hp=0x80c501c sp=0x8136a0c fp=0x8136a10 ip=0x8085148
HEAP_CVAL_I5 hp=0x80c5020 sp=0x8136a0c fp=0x8136a10 ip=0x8085149
HEAP_OFF_N1 hp=0x80c5024 sp=0x8136a0c fp=0x8136a10 ip=0x808514a
PUSH_CVAL_P1 hp=0x80c5028 sp=0x8136a0c fp=0x8136a10 ip=0x808514c
PUSH_I1 hp=0x80c5028 sp=0x8136a08 fp=0x8136a10 ip=0x808514e
ZAP_STACK_P1 hp=0x80c5028 sp=0x8136a04 fp=0x8136a10 ip=0x808514f
EVAL hp=0x80c5028 sp=0x8136a04 fp=0x8136a10 ip=0x8085151
eval: evalToS
Program received signal SIGSEGV, Segmentation fault.
0x0804ac27 in run (toplevel=0x80c5008) at mutator.c:425
425 IND_REMOVE(nodeptr);
</pre>
<p>
<code>ip</code> is the instruction pointer, which points at the current element in the bytecode stream. <code>fp</code> is probably the frame pointer, <code>sp</code> the stack pointer, and <code>hp</code> the heap pointer. The <a href="https://www.haskell.org/nhc98/implementation-notes/index.html">implementation notes for nhc98</a> will probably be helpful in solving this problem.
</p>
<p>
Anyway, that’s where I’m at so far. If you are interested in these kinds of problems or other bootstrapping projects, consider joining the efforts of the <a href="http://bootstrappable.org">Bootstrappable Builds project</a>!
</p>
</div>
</div>
<div id="footnotes">
<h2 class="footnotes">Footnotes: </h2>
<div id="text-footnotes">
<div class="footdef"><sup><a id="fn.1" href="#fnr.1" class="footnum">1</a></sup> <div class="footpara">It is unclear when exactly the release was made, but any time between 1991 and 1993 seems likely.</div></div>
</div>
</div></div>#ilovefs: Why GNU Emacs?https://elephly.net/posts/2016-02-14-ilovefs-emacs.htmlRicardo Wurmusrekado+web@elephly.net2016-02-14T22:00:00Z<h2>Why write about Emacs?</h2><p>I don’t usually try to explain tools that I use to other people,
unless they made an explicit request as to how they could improve
their workflow. However, since <a href="https://fsfe.org/campaigns/ilovefs/2016/">today is “I love Free
Software” Day</a> I think I should seize this opportunity and
explain what draws me to <a href="http://gnu.org/software/emacs">GNU Emacs</a> and how I use it.</p><p>Sometimes people who use computers ask me why I would use
something as “bloated” as Emacs for text editing. Usually they
remark that Emacs is a hold-over from a by-gone era, much too
large compared to editors like “vi”, and that they are quite
content using a variant of vi or some Notepad-like editor. They
may have heard that you can play Tetris inside of Emacs and you
can tell that they have difficulties hiding the fact that they
are mildly disgusted by this abomination, a tool that seems to
ignorantly contradict the Unix philosophy of doing just one thing
and doing it well.</p><h2>Embracing the operating system</h2><p>I cannot help but notice that there’s a misunderstanding; at the
very least there’s an invalid assumption, namely that we agree on
terminology. I do not consider Emacs a mere “editor”. To some
this is folk wisdom, a now blunt blade used by the old warriors
in the <a href="https://en.wikipedia.org/wiki/Editor_war">editor
wars</a> of ancient history:</p><blockquote>Emacs is a great operating system, lacking only a decent editor.</blockquote><p>I contest the second part (as Emacs has a multitude of decent
editor modes, even for fans of vi), but I do agree with the
hyperbolic first part: yes, an operating system indeed!</p><p>Maybe not quite in the sense of the GNU operating system, but
certainly in the sense that it is a platform to run applications.
In fact, it is a platform very much like a modern web browser
resembles an application platform more than it does a mere HTML
document viewer.</p><p>Just like a browser is used by many as a platform for running
applications operating on some HTML document, Emacs is a platform
for anything that can “reasonably” (this is up for
interpretation) be mapped to buffers of text. Applications in
browsers are written in JavaScript, applications in Emacs are
written in EmacsLisp (also called “elisp”).</p><h2>The universal text environment</h2><p>A text buffer in Emacs could hold the trail of a shell session
(<code>shell-mode</code>), an email (<code>message-mode</code>), a TODO
list (<code>org-mode</code>), a directory listing (<code>dired</code>),
a text file on disk, a chat session (<code>ERC</code>), a web page
(<code>eww</code>), the output produced by an external command, etc.
Just like a modern web browser represents an environment in which
a programming language can be used to manipulate and interact
with HTML documents, Emacs is an environment for text buffers
with a language that can be used to manipulate and interact with
text buffers.</p><p>If you have used your web browser (or have observed someone use
their web browser) to play games, listen to music, watch videos,
read and compose email, edit text (e.g. by contributing to the
Wikipedia), chat with friends (or chat about foes), read
documentation, installed an extension—well, then the notion
of a generic tool as a platform should not be a foreign concept
to you. Emacs can be understood as such a generic tool providing
a text interface (one of which may be a file editor).</p><h2>Living in Emacs</h2><p>Emacs is my main user agent—it acts as an assistant on my
behalf in all matters relating to text—much like the
browser is the main user agent for documents and applications on
the web to many people. This is why I hardly remember when I
last closed Emacs. I do not start Emacs to edit a file; I’m
living in Emacs.</p><p>Not only am I writing this blog post in Emacs (obviously!), I’m
also keeping track of multiple conversations on IRC in separate
buffers; I’m <a href="http://www.djcbsoftware.nl/code/mu/mu4e.html">reading and composing email</a>; I manage my GNU Guix software
profiles with a dedicated Emacs mode; I deal with Git through a
<a href="http://magit.vc/">convenient two-dimensional text-based
user interface</a> rather than using the one-dimensional, terse
command line interface; when I view man pages I use <a href="https://www.gnu.org/software/emacs/manual/html_mono/woman.html">woman</a>, which greatly enhances man page navigation; of course I
use Emacs as an Info documentation browser as well; my shell
sessions are in Emacs thanks to <code>shell-mode</code>—I’m not
one of those who run Emacs in a shell session inside a terminal
emulator—; I view pretty PDF documents in Emacs buffers with
<a href="https://github.com/politza/pdf-tools">PDF tools</a>, and
even my complete personal organisation and calendar needs are
satisfied by an application running in Emacs (see <a href="http://orgmode.org">Org mode</a>).</p><h2>There is no <em>one</em> Emacs</h2><p>What is crucial to understand is that Emacs is not one and the
same thing to any two Emacs users. It is malleable and
accessible thanks to being written in EmacsLisp. If ogres are
like onions, Emacs is probably a giant cherry: a small solid
core (written in C) and a delicious mantle of sweet EmacsLisp
(analogies are not my strong suit). Since almost every
conceivable feature provided by Emacs is accessible through
EmacsLisp and can be tweaked, rewired, or fully replaced, Emacs
becomes what you want it to be.</p><p>I probably could not use an Emacs instance that has been shaped
by the habits of another hacker, and they probably also wouldn’t
be happy with my configuration. It’s like a tailor-made shirt in
that it fits you exactly (if you take some time to take your
measurements), yet it also fits like the most comfortable sweat
pants as it won’t punish you if you change your sporty habits and
gain weight.</p><h2>What’s GNU? GNU’s Not Unix!</h2><p>This leads me to the last point I wanted to address: the claim
that Emacs is bloated and ignores the Unix philosophy of doing
only one thing and doing it well. I don’t know what “bloated”
really means. Emacs does come with a lot of features but this
doesn’t make it bloated.</p><p>I think this claim is rooted in another misunderstanding. When
you have a terminal emulator open in which you run a shell
session (like bash), and you run a command like <code>ls</code>, you
would not consider the shell bloated for allowing you to
interact seamlessly with external commands. Likewise you
probably don’t object to builtin commands that cannot easily be
expressed with external executables or that make the shell more
convenient to use.</p><p>Similarly, Emacs is the perfect glue between different text-based
applications. When I run a shell inside of Emacs, what Emacs
really does is spawn an external shell process and redirect input
and output to talk to it transparently. Or when I read email in
mu4e the mail directory and its indexing database are not part of
Emacs. Or when I read PDFs they are actually rendered by a
separate process. Since many of these features are provided by
optional extensions there really isn’t much to the claim that
Emacs is bloated.</p><p>However, it is true that Emacs does not blindly subscribe to the
Unix philosophy. One of its previous logos (my favourite) was an
<a href="http://www.emacswiki.org/pics/static/KitchenSinkWhite.png">overflowing kitchen sink</a>, acknowledging the fact that Emacs
errs on the side of including more features rather than
fewer when it is convenient. The goal of the GNU project never
was to merely provide a free clone of proprietary Unices, but to
give users <a href="https://www.gnu.org/philosophy/free-sw.html">software freedom</a>. In the case of Emacs the boundary between
user and programmer is blurred as adapting the environment to
one’s needs is <a href="https://www.gnu.org/software/emacs/emacs-paper.html">already an
act of programming with a very low barrier to entry</a>. Emacs
provides <em>practical</em> software freedom and that’s one of the
main reasons why over the course of many years my perception of
it has slowly shifted from a belittled tool only old-fashioned
people use to the centre-piece of most of my daily computing
activities.</p><p>Yay for GNU Emacs, yay for free software!</p>Getting started with GNU Guixhttps://elephly.net/posts/2015-06-21-getting-started-with-guix.htmlRicardo Wurmusrekado+web@elephly.net2015-06-21T00:00:00Z<warning><strong>Feb 24, 2016:</strong>
This post has been updated for the 0.9.0 release of GNU Guix.</warning><p><a href="/posts/2015-04-17-gnu-guix.html">Previously I wrote</a>
about how using GNU Guix in an HPC environment enables easy
software deployment for multiple users with different needs when
it comes to application and library versions. Although Guix comes
with an excellent manual which is also <a href="https://www.gnu.org/software/guix/manual/guix.html">available
online</a>, some people may want to have just some simple
installation instructions in one place and some pointers to get
started. I’m attempting to provide just that with this article.</p><p>While Guix can be built from source, it is much more convenient to
use the self-contained tarball, which provides pre-built binaries
for Guix and all its dependencies. You need to have GNU tar and
xz installed to unpack the tarball. Note that the tarball will
only work on GNU/Linux systems; it will not work on MacOS.</p><p>Guix needs a little bit of setting up, which can be done in just
a couple of steps.</p><h2>Download and check</h2><p><em>First</em>, if you are using a 64 bit machine, download the
compressed <a href="ftp://alpha.gnu.org/gnu/guix/guix-binary-0.9.0.x86_64-linux.tar.xz">x86_64 archive from the FTP server</a>. There also is a <a href="ftp://alpha.gnu.org/gnu/guix/guix-binary-0.9.0.i686-linux.tar.xz">tarball for 32 bit machines</a> and <a href="ftp://alpha.gnu.org/gnu/guix/">for other architectures</a>.</p><p>For your own sake you really should also download the matching
cryptographic signature file (they all have the same name as the
archive you downloaded, but end in <code>.sig</code>) to ensure that
the tarballs are signed by release managers. Releases up to now
were signed by <a href="https://pgp.mit.edu/pks/lookup?op=vindex&search=0x090B11993D9AEBB5">Ludovic Courtès</a>. I suggest you fetch both Ludo's and my own
PGP keys from PGP key servers, for example by doing this:</p><pre><code># gpg2 --recv-keys 090b11993d9aebb5 197a5888235facac</code></pre><p>You only need to do this once. With these keys you can now check
that the file you downloaded is in fact legit. To verify that
the file is indeed signed by the release manager and the
signature is valid, run the following command in the same directory that
holds the tarball and the signature file:</p><pre><code># gpg2 --verify guix-binary-0.9.0.x86_64-linux.tar.xz.sig</code></pre><p>If you see something like “Good signature from "Ludovic Courtès
&lt;ludo@gnu.org&gt;” you’re safe (according to your trust in the keys you
downloaded).</p><h2>Unpacking the archive</h2><p><em>Second</em>, unpack the archive as root in the root directory:</p><pre><code># cd /
# tar xf guix-binary-0.9.0.SYSTEM.tar.xz</code></pre><p>This creates a pre-populated store at <code>/gnu/store</code>
(containing the “guix” package and the complete dependency graph),
the <em>local state directory</em> <code>/var/guix</code>, and a Guix
profile for the root user at <code>/root/.guix-profile</code>, which
contains the guix command line tools and the daemon.</p><h2>Create dedicated build users</h2><p><em>Third</em>, create a build user pool, as root:</p><pre><code># groupadd --system guix-builder
# for i in `seq 1 10`;
do
useradd -g guix-builder -G guix-builder \
-d /var/empty -s `which nologin` \
-c "Guix build user $i" --system \
guix-builder$i;
done</code></pre><p>These are the restricted user accounts which are used by the
daemon to build software in a controlled environment. You may not
need ten, but it’s a good default.</p><h2>Run the build daemon</h2><p><em>Fourth</em>, run the daemon and tell it about the <code>guix-builder</code> group:</p><pre><code># /root/.guix-profile/bin/guix-daemon --build-users-group=guix-builder</code></pre><p>Note that this is a server process, so it will never return. I
suggest turning this into a system service and keeping it running in
the background at all times. The archive unpacks a Systemd
service file to <code>/gnu/store/632msbms2yald...-guix-0.9.0/lib/systemd/system/guix-daemon.service</code>,
which you can just copy to <code>/etc/systemd/system/</code>; run
the following commands to enable and start the service:</p><pre><code># systemctl daemon-reload
# systemctl enable guix-daemon
# systemctl start guix-daemon</code></pre><p>The daemon is responsible for handling build requests from users, so
it is essential that it keeps running.</p><p>Since building all software locally can take a very long time,
the GNU Guix build farm hydra.gnu.org is by default authorised as
a source for so-called binary substitutes.</p><p>Note that hydra.gnu.org isn’t at all special. Packages are built
there continuously from source. Guix is flexible and can pull
binary substitutes from other locations as long as you authorise
them. Check the Guix Info manual for more information about
substitutes.</p><h2>Guix for everyone</h2><p><em>Fifth</em>, make the <code>guix</code> command available to other users
on the machine by linking it to a location everyone can access,
such as <code>/usr/local/bin</code>.</p><pre><code># mkdir -p /usr/local/bin
# cd /usr/local/bin
# ln -s /root/.guix-profile/bin/guix</code></pre><p>Now any user—not just the almighty root—can install software by
invoking <code>guix package -i whatever</code>. Yay!</p><h2>Where to go from here</h2><p>Congratulations! You now have a fully functional installation of
the Guix package manager.</p><p>To get the latest package recipes for Guix just run <code>guix
pull</code>, which will download and compile the most recent
development version for the current user. This allows users
(including root) to each have their own version of Guix.</p><p>I recommend reading the excellent Guix reference manual, which is
<a href="https://www.gnu.org/software/guix/manual/guix.html">available on the web</a> and, of course, included as an Info
document in your Guix installation. If you don’t have Emacs—the
best Info reader, which also happens to be an excellent text
editor—I encourage you to install it from Guix; it is just a
<code>guix package -i emacs</code> away!</p><p>If you have questions that are not covered by the manual feel free
to chat with members of the Guix community <a href="https://webchat.freenode.net/?channels=#guix">on IRC in the
#guix channel on Freenode</a>. For matters relating to using Guix
in a bioinformatics environment you are welcome to subscribe and
write to the <a href="http://lists.open-bio.org/mailman/listinfo/bio-packaging">mailing list bio-packaging@mailman.open-bio.org</a>.</p>GNU Guix in an HPC environmenthttps://elephly.net/posts/2015-04-17-gnu-guix.htmlRicardo Wurmusrekado+web@elephly.net2015-04-17T00:00:00Z<p>I spend my daytime hours as a system administrator at a research
institute in a heterogeneous computing environment. We have two
big compute clusters (one on CentOS the other on Ubuntu) with
about 100 nodes each and dozens of custom GNU/Linux workstations.
A common task for me is to ensure the users can run their
bioinformatics software, both on their workstation and on the
clusters. Only few bioinformatics tools and libraries are
popular enough to have been packaged for CentOS or Ubuntu, so
usually some work has to be done to build the applications and
all of their dependencies for the target platforms.</p><h2>How to waste time building and deploying software</h2><p>In theory, compiling software is not a very difficult thing to
do. Once all development headers have been installed on the
build host, compilation is usually a matter of configuring the
build with a configure script and running GNU make with various
flags (this is an assumption which is violated by bioinformatics
software on a regular basis, but let’s not get into this now).
However, there are practical problems that become painfully
obvious in a shared environment with a large number of users.</p><h3>Naive compilation</h3><p>Compiling software directly on the target machine is an option
only in the most trivial cases. With more complicated build
systems or complicated build-time dependencies there is a strong
incentive for system administrators to do the hard work of
setting up a suitable build environment for a particular piece of
software only once. Most people would agree that package
management is a great step up from naive compilation, as the
build steps are formalised in some sort of recipe that can be
executed by build tools in a reproducible manner. Updates to
software only require tweaks to these recipes. Package
management is a good thing.</p><h3>System-dependence</h3><p>Non-trivial software that was built and dynamically linked on one
machine with a particular set of libraries and header files at
particular versions can only really work on a system with the
very same libraries at compatible versions in place. Established
package managers allow packagers to specify hard dependencies and
version ranges, but the binaries that are produced on the build
host will only work under the constraints imposed on them at
build time. To support an environment in which software must run
on, say, both CentOS 6.5 and CentOS 7.1, the packages must be
built in both environments and binaries for both targets have to
be provided.</p><p>There are ways to emulate a different build environment
(e.g. Fedora’s <code>mockbuild</code>), but we cannot get around the
fact that dynamically linked software built for one kind of
system will only ever work on that very kind of system. At
runtime we can change what libraries will be dynamically loaded,
but this is a hack that pushes the problem from package
maintainers to users. Running software with <code>LD_LIBRARY_PATH</code> set is not a solution, nor is static linking,
which is the equivalent of copying chunks of libraries at build time.</p><h3>Version conflicts</h3><p>Libraries and applications that come pre-installed or
pre-packaged with the system may not be the versions a user
claims to need. Say, a user wants the latest version of GCC to
compile code using new language features specified in C++11
(e.g. anonymous functions). Full support for C++11 arrived in
GCC 4.8.1, yet on CentOS 6.5 only version 4.4.7 is available
through the repositories. The system administrator may not
necessarily be able to upgrade GCC system-wide. Or maybe other
users on a shared system do need version 4.4.7 to be available
(e.g. for bug-compatibility). There is no easy way to satisfy
all users, so a system administrator might give up and let users
build their own software in their home directories instead of
solving the problem.</p><p>However, compiling GCC is a daunting task for a user and they
really shouldn’t have to do this at all. We already established
that package management is a good thing; why should we deny users
the benefits of package management? Traditional package
management techniques are ill-suited to the task of installing
multiple versions of applications or libraries into independent
prefixes. RPM, for example, allows users to maintain a local,
independent package database, but <code>yum</code> won’t work with
multiple package databases. Additionally, only <em>one</em>
package database can be used at once, so a user would have to
re-install system libraries into the local package database to
satisfy dependencies. As a result, users lose the important
feature of automatic dependency resolution.</p><h3>Interoperability</h3><p>A system administrator who decides to package software as
relocatable RPMs, to install the applications to custom prefixes
and to maintain a separate repository has nothing to show for
when a user asks to have the packaged software installed on an
Ubuntu workstation. There are ways to convert RPMs to DEB
packages (with varying degrees of success), but it seems silly to
have to convert or rebuild stuff repeatedly when the software,
its dependencies and its mode of deployment really didn’t change
at all.</p><p>What happens when a Slackware user comes along next? Or someone
using Arch Linux? Sure, as a system administrator you could
refuse to support any system other than CentOS 7.1, users be
damned. Traditionally, it seems that system administrators
default to this style for convenience and/or practical reasons,
but I consider this unhelpful and even somewhat oppressive.</p><h2>Functional package management with GNU Guix</h2><p>Luckily, I’m not the only person to consider traditional
packaging methods inadequate for a number of valid purposes.
There are different projects aiming to improve and simplify
software deployment and management, one of which I will focus on
in this article. As a functional programmer, Scheme aficionado
and free software enthusiast I was intrigued to learn about <a href="https://www.gnu.org/software/guix/">GNU Guix</a>, a functional
package manager written in <a href="https://www.gnu.org/software/guile/">Guile Scheme</a>, the
designated extension language for the <a href="https://www.gnu.org/">GNU system</a>.</p><p>In purely functional programming languages a function will
produce the very same output when called repeatedly with the same
input values. This allows for interesting optimisation, but most
importantly it makes it <em>possible</em> and in some cases even
<em>easy</em> to reason about the behaviour of a function. It is
independent from global state, has no side effects, and its
outputs can be cached as they are certain not to change as long
as the inputs stay the same.</p><p>Functional package management lifts this concept to the realm of
software building and deployment. Global state in a system
equates to system-wide installations of software, libraries and
development headers. Side effects are changes to the global
environment or global system paths such as <code>/usr/bin/</code>.
To reject global state means to reject the common file system
hierarchy for software deployment and to use a minimal <code>chroot</code> for building software. The introduction of the Guix
manual describes the approach as follows:</p><blockquote><p>The term “functional” refers to a specific package management
discipline. In Guix, the package build and installation process
is seen as a function, in the mathematical sense. That function
takes inputs, such as build scripts, a compiler, and libraries,
and returns an installed package. As a pure function, its
result depends solely on its inputs—for instance, it cannot
refer to software or scripts that were not explicitly passed as
inputs. A build function always produces the same result when
passed a given set of inputs. It cannot alter the system’s
environment in any way; for instance, it cannot create, modify,
or delete files outside of its build and installation
directories. This is achieved by running build processes in
isolated environments (or “containers”), where only their
explicit inputs are visible.</p><p>The result of package build functions is “cached” in the file
system, in a special directory called “the store”. Each package
is installed in a directory of its own, in the store—by default
under ‘/gnu/store’. The directory name contains a hash of all
the inputs used to build that package; thus, changing an input
yields a different directory name.</p></blockquote><p>The following diagram (taken from the <a href="https://www.gnu.org/software/guix/guix-fosdem-20150131.pdf">slides for a talk by Ludovic Courtès</a>) illustrates how the
build daemon handles the package build processes requested by a
client via remote procedure calls:</p><img class="full stretch" src="/images/posts/2015/guix-build.png" alt="Software is built by the Guix daemon in isolation" /><h3>Isolated, yet shared</h3><p>Note that the package outputs are still dynamically linked.
Libraries are referenced in the binaries with their full store
paths using the runpath feature. These package outputs are not
self-contained, monolithic application directories as you might
know them from MacOS.</p><p>Any built software is cached in the store which is shared by all
users system-wide. However, by default the software in the store
has no effect whatsoever on the users’ environments. Building
software and having the results stored in <code>/gnu/store</code> does
not alter any global state; no files pollute <code>/usr/bin/</code>
or <code>/usr/lib/</code>. Any effects are restricted to the
package’s single output directory inside the <code>/gnu/store</code>.</p><p>Guix provides per-user profiles to map software from the store
into a user environment. The store provides deduplication as it
serves as a cache for packages that have already been built. A
profile is little more than a “forest” of symbolic links to items
in the store. The union of links to the outputs of all software
packages the user requested makes up the user’s profile. By
adding another layer of symbolic link indirection, Guix allows
users to seamlessly switch among different generations of the
same profile, going back in time.</p><p>User profiles are completely isolated from one another, making
it possible for different users to have different versions of GCC
installed. Even one and the same user could have multiple
profiles with different versions of GCC and switch between them
as needed.</p><p>Guix takes the functional packaging method seriously, so except
for the running kernel and the exposed machine hardware there are
virtually no dependencies on global state (i.e. system libraries
or headers). This also means that the Guix store is populated
with the complete dependency tree, down to the kernel headers and
the C library. As a result, software in the Guix store can run
on very different GNU/Linux distributions; a shared Guix store
allows me to use the very same software on my Fedora workstation,
as well as on the Ubuntu cluster, and on the CentOS 6.5 cluster.</p><p>This means that software only has to be packaged up once. Since
package recipes are written in a very declarative domain-specific
language on top of Scheme, packaging is surprisingly simple (and,
to this Schemer, rather enjoyable).</p><h3>User freedom</h3><p>Guix liberates users from the software deployment decisions of
their system administrators by giving them the power to build
software into an isolated directory in the store using simple
package recipes. Administrators only need to configure and run
the Guix daemon, the core piece running as root. The daemon
listens to requests issued by the Guix command line tool, which
can be run by users without root permissions. The command line
tool allows users to manage their profiles, switch generations,
build and install software through the Guix daemon. The daemon
takes care of the store, of evaluating the build expressions and
“caching” build results, and it updates the forest of symbolic
links that constitutes the profile state.</p><p>Users are finally free to conveniently manage their own software,
something they could previously only do in a crude manner by
compiling manually.</p><h2>Using a shared Guix store</h2><p>Guix is not designed to be run in a centralised manner. A Guix
daemon is supposed to run on each system as root and it listens
to RPCs from local users only. In an environment with multiple
clusters and multiple workstations this approach requires
considerable effort to make it work correctly and securely.</p><img class="full stretch" src="/images/posts/2015/guix-shared.svg" alt="Sharing Guix store and profiles" /><p>Instead we opted to run the Guix daemon on a single dedicated
server, writing profile data and store items onto an NFS share.
The cluster nodes and workstations mount this share read-only.
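</p><p>On the client side, such a setup might be described by <code>/etc/fstab</code> entries like the following sketch (the hostname is hypothetical, and I assume Guix’s state directory is <code>/var/guix</code>; adapt both to your environment):</p>

```
# hypothetical fstab entries on a workstation or cluster node;
# "guixserver" stands for the dedicated machine running guix-daemon
guixserver:/gnu/store  /gnu/store  nfs  ro,defaults  0 0
guixserver:/var/guix   /var/guix   nfs  ro,defaults  0 0
```

<p>Both mounts are read-only: only the daemon on the server ever writes to the store or to the profile metadata.</p><p>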
Although this means that users lose the ability to manage their
profiles directly on their workstations and on the cluster nodes
(because they have no local installation of the Guix client or
the Guix daemon, and because they lack write access to the shared
store), their software profiles are now available wherever they
are. To manage their profiles, users would log on to the Guix
server where they can install software into their profiles, roll
back to previous versions or send other queries to the Guix
daemon. (At some point I think it would make sense to enhance
Guix such that RPCs can be made over SSH, so that explicit
logging on to a management machine is no longer necessary.)</p><h2>Guix as a platform for scientific software</h2><p>Since winter 2014 I have been packaging software for GNU Guix,
which meanwhile has accumulated quite a few common and obscure
<a href="http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/bioinformatics.scm">bioinformatics tools and libraries</a>. A daily-updated list of the
software available through Guix can be found <a href="https://www.gnu.org/software/guix/package-list.html">here</a>. We also have common Python modules for scientific
computing, as well as programming languages such as R and Julia.</p><p>I think GNU Guix is a great platform for scientific software in
heterogeneous computing environments. The Guix project follows
the <a href="https://gnu.org/distros/free-system-distribution-guidelines.html">Free System Distribution Guidelines</a>, which mean that free
software is welcome upstream. For software that imposes
additional usage or distribution restrictions (such as when the
original Artistic license is used instead of the Clarified
Artistic license, or when commercial use is prohibited by the
license) Guix allows the use of out-of-tree package modules
through the <code>GUIX_PACKAGE_PATH</code> variable. As Guix
packages are just Scheme variables in Scheme modules, it is
trivial to extend the official GNU Guix distribution with package
modules by simply setting the <code>GUIX_PACKAGE_PATH</code>.</p><p>If you want to learn more about GNU Guix I recommend taking a
look at the excellent <a href="https://www.gnu.org/software/guix/">GNU Guix project page</a>. Feel free to contact me if you want to
learn more about packaging scientific software for Guix. It is
not difficult and we all can benefit from joining efforts in
adopting this usable, dependable, hackable, and liberating
platform for scientific computing with free software.</p><p>The Guix community is very friendly, supportive, responsive and
welcoming. I encourage you to visit the project’s <a href="https://webchat.freenode.net?channels=#guix">IRC channel #guix
on Freenode</a>, where I go by the handle “rekado”.</p>Hacking the Wavedrumhttps://elephly.net/posts/2013-08-11-hacking-the-wavedrum.htmlRicardo Wurmusrekado+web@elephly.net2013-08-11T00:00:00Z<p>The Wavedrum Oriental is a wonderful electronic instrument.
Unlike an electronic drum set or drum machines with touch
sensitive pads, this drum synthesizer’s sensors don’t merely
trigger samples. The sensors rather behave like microphones or
the pickup in an electric guitar; the signals of the four
sensors—one sensor for the drum head, one sensor on the left and
another on the right of the metal rim, and a pressure sensor in
the centre of the drum—are used to control drum synthesizer
algorithms whose output can be mixed with PCM samples. As a
result, the instrument feels a lot like a real drum, a feat that
cannot easily be achieved with devices that use simple
velocity-sensitive sample triggers.</p><p>For all its magic, the Wavedrum also has a number of flaws. Most
prominently, all editing is done through five buttons and an
endless rotary encoder. What parameters can be selected by the
buttons and thus manipulated by the encoder depends entirely on
context; one button is used to jump to the next parameter page;
simultaneously pressing a pair of buttons switches between the
two edit modes (why two?), “global” mode and “live” mode; as one
navigates through this confusing environment, a three-character
seven-segment display conjures up magic strings that occasionally
resemble abbreviations of what appear to be parameter names.
Without a copy of the manual lying next to the device, any
attempt at deciphering the cryptic three-character hints is
doomed to fail.</p><p>Another painful flaw is the lack of connectivity. There is no
way to export these precious custom programmes that were edited
with so much difficulty. There is no way to back up the
programmes, nor can one share programmes with another Wavedrum.
When the device dies, all custom patches go down with it. Or so
it seemed.</p><h2>A look inside the Wavedrum</h2><p>Not really knowing what to look for I opened up the Wavedrum in
the hopes of finding <em>something</em> that would allow me to
extend the feature set of the instrument. I first took off the
control panel. Only three screws have to be loosened to lift the
front panel and look underneath. The PCB, however, does not
offer much of interest.</p><p>Much more interesting is the main board which is located right
underneath the drum head. Removing the rim and the drum head
apparently is a permitted activity which does not void the
warranty—the manual includes instructions on how to change the
drum head.</p><img class="full stretch" src="/images/posts/2013/wavedrum-opened.jpg" alt="After removing the rim and the drum head" /><p>Once the rim and drum head are out of the way, one can already
see much of the main board, but access is denied by a transparent
plastic disk. In my instrument the plastic disk was only screwed
to two posts although there are holes for seven screws. The
heads of the two screws are covered with adhesive pads that can
easily be removed to undo the screws. (Don’t worry, the glue on
the pads is strong enough to put them back on when you’re done.)</p><warning><strong>Warning:</strong> if you are following these
instructions, at this point, I believe, you might be voiding your
warranty. If you’re careful and you don’t tell anyone, nobody
should ever notice. Note that I cannot be made responsible for any
damage to your device that may result from following these
instructions.</warning><p>With this warning out of the way, let’s move on.</p><p>The main board is very tidy, making it easy to understand what’s
going on. The densely packed section on the right appears to be
power supply and rim sensor amplification logic. On the left you
can see two medium-sized chips, a bulky capacitor and something
covered with a black, textured tape. The two chips are RAM
(ESMT) and flash memory (cFeon), respectively. The big capacitor
buffers the power for the two memory chips and the massive DSP on
the back of the board. The back side of the board is rather
boring as it really only holds the DSP chip (ADSP-21375 by Analog
Devices). The board has an unusually large number of
test pads (most of which are connected to ground, used for
automated testing) and quite a few connector pads, possibly
allowing a hardware debugger to be connected to debug the DSP’s
firmware <em>in situ</em>.</p><img class="full stretch" src="/images/posts/2013/wavedrum-mainboard.jpg" alt="The mainboard of the Wavedrum Oriental" /><h2>The treasure trove</h2><p>What is underneath that black tape on the front, though? This is
where things get really interesting (well, to people like me, at
least). As I carefully removed the tape I was pleasantly
surprised to see a micro SD card reader underneath. The card is
locked to the reading interface to make sure it stays in place
during operation. Unlock it by shifting the metal brace to the
right whereupon it can be lifted.</p><div class="figure"><img src="/images/posts/2013/wavedrum-card-tape.jpg" alt="The taped-over SD card" /><p class="caption">The taped-over SD card</p></div><p>The 2GB micro SD card is a standard card formatted with a FAT32
filesystem, making it possible to read it out with a standard
card reader. My netbook has a built-in SD card reader only, so I
first needed to buy an adapter to connect the micro SD card.
This reader is a little weird. It seems that the adapter must be
in the reader slot on boot or the micro SD card won’t be
recognised. (If you’re unsure whether the card is recognised by
your system check the output of <code>dmesg</code>.) Eventually,
the card was recognised as <code>/dev/sdb1</code>. (<code>/dev/sdb</code> is the SD card reader device itself.) As this is my
only Wavedrum and I intend to use it for years to come I decided
to be especially careful this time and only operate on a <em>copy</em> of the card. The Wavedrum’s card reader is perfectly
capable of reading 8GB micro SD HC cards, so if you want to play
with the data on the card I recommend mirroring the original card
image onto whatever micro SD card you have at your disposal and
play with that instead of the original card. To create a block
level copy of the card I <em>did not</em> mount the filesystem and
simply executed the following command:</p><pre><code>dd if=/dev/sdb1 of=wavedrum.img</code></pre><p>This instructs <code>dd</code> to copy all blocks from the input
device file (<code>if</code>, i.e. <code>/dev/sdb1</code>) to the
output file (<code>of</code>) of the name <code>wavedrum.img</code>.
Depending on the number of disks on your system, the input device
file may have a different name. Check the output of <code>dmesg | tail</code> as you connect the card reader to see which
device node is created for the micro SD card. Note that this
blindly copies <em>everything</em> on the micro SD card, not just
files that are available through the FAT32 filesystem. Hence,
the size of the image is quite a bit larger than the sum of all
files on the mounted image (502,145,536 bytes vs 234,479,878
bytes).</p><p>Before continuing, please put the micro SD card back into the
Wavedrum’s card reader and lock it to prevent it from being
damaged (things can get messy, you know). Going forward, we only
need to mount the image to access the data stored on the card.
Run the following as root to mount the card image as a read-only
filesystem:</p><pre><code>mkdir wavedrum
mount -o loop,ro wavedrum.img wavedrum/</code></pre><p>Let’s take a look at the files on the card:</p><ul class="tree"><li><span class="NORM">/</span></li><li>├── ( 16) <span class="EXEC">CALIB.BOR</span></li><li>├── ( 16K) <span class="DIR">Factory</span></li><li>│ ├── (192K) <span class="EXEC">F_INFO.BOR</span></li><li>│ ├── ( 57K) <span class="EXEC">F_INST_H.BOR</span></li><li>│ ├── ( 57K) <span class="EXEC">F_INST_R.BOR</span></li><li>│ ├── ( 16K) <span class="EXEC">F_PROG.BOR</span></li><li>│ └── ( 88) <span class="EXEC">F_USER.BOR</span></li><li>├── ( 57K) <span class="EXEC">INST_H.BOR</span></li><li>├── ( 57K) <span class="EXEC">INST_R.BOR</span></li><li>├── ( 16K) <span class="DIR">LOOP</span></li><li>│ ├── (744K) <span class="EXEC">LOOP0001.BIN</span></li><li>│ ├── (402K) <span class="EXEC">LOOP0002.BIN</span></li><li>│ ├── (750K) <span class="EXEC">LOOP0003.BIN</span></li><li>...</li><li>│ ├── (173K) <span class="EXEC">LOOP0138.BIN</span></li><li>│ ├── (173K) <span class="EXEC">LOOP0139.BIN</span></li><li>│ └── (234K) <span class="EXEC">LOOP0140.BIN</span></li><li>├── ( 16K) <span class="EXEC">PRE_PROG.BOR</span></li><li>├── ( 16K) <span class="DIR">SYSTEM</span></li><li>│ ├── ( 16) <span class="EXEC">VERSION.INF</span></li><li>│ ├── (1.0M) <span class="EXEC">WDORM202.BIN</span></li><li>│ └── (8.0K) <span class="EXEC">WDORS110.BIN</span></li><li>├── ( 88) <span class="EXEC">USER.BOR</span></li><li>├── (157M) <span class="EXEC">WD2_DATA.BOR</span></li><li>├── (192K) <span class="EXEC">WD2_INFO.BOR</span></li><li>└── ( 16K) <span class="EXEC">WD2_PROG.BOR</span></li></ul><p>The files in the <code>Factory</code> directory contain
initialisation data. When a factory reset is performed, the
customised versions of these files in the root directory are
overwritten with the versions contained in the <code>Factory</code>
directory. All initial programmes that come with the Wavedrum
are stored in <code>Factory/F_PROG.BOR</code>; once programmes have
been edited <code>WD2_PROG.BOR</code> in the root directory will
differ from <code>Factory/F_PROG.BOR</code>. (More about the nature
of these differences later.) <code>PRE_PROG.BOR</code> is the same
as <code>Factory/F_PROG.BOR</code> and is probably used to make the
original factory presets available in addition to custom
programmes, starting at position <code>P.00</code>, the programme
slot after <code>149</code>.</p><p>The initial mapping of presets to any of the 12 slots (3 banks
with 4 slots each) is stored in <code>Factory/F_USER.BOR</code>.
Initially, <code>USER.BOR</code> in the root directory will be
identical to this file. The format of this file is rather
simple:</p><pre><code>00000000 | 00 00 00 64 00 00 00 67 00 00 00 7b 00 00 00 6c
00000010 | 00 00 00 65 00 00 00 68 00 00 00 71 00 00 00 7a
00000020 | 00 00 00 84 00 00 00 8c 00 00 00 8b 00 00 00 95
00000030 | 00 00 00 00 00 00 00 00 00 00 00 75 00 00 00 26
00000040 | 00 00 00 07 00 00 00 14 00 00 00 07 00 00 00 14
00000050 | 00 00 00 05 00 00 00 64</code></pre><p>Every 8-digit block (4 bytes) is used for one slot. We can see
that the first slot in bank A is set to programme 100 (0x64 hex),
the second to programme 103 (0x67 hex) and so on. As the
Wavedrum only allows for 12 slots to store programme identifiers,
only the first 48 bytes are used for programmes. The remaining
40 bytes (starting at 0x30) are used for global parameters that
can be adjusted in “global” editing mode. The global parameters
are stored in this order:</p><ul><li>delay pan</li><li>aux input level</li><li>loop phrase select</li><li>loop play mode (off=38)</li><li>head sensor threshold</li><li>head sensor sensitivity</li><li>rim sensor threshold</li><li>rim sensor sensitivity</li><li>pressure sensor threshold</li><li>pressure maximum</li></ul><p>I don’t know yet what purpose <code>F_INST_H.BOR</code> and <code>F_INST_R.BOR</code> serve, but it is clear that the former relates to
settings for the drum head while the latter contains similar
settings for the rim. Even after editing a few programmes,
<code>INST_H.BOR</code> and <code>INST_R.BOR</code> in the root
directory were still identical to their counterparts in the
<code>Factory</code> directory.</p><p>The <code>CALIB.BOR</code> appears to contain calibration
information for the head and rim sensors. This is different from
the calibration performed by adjusting the four global parameters
for sensor threshold and sensitivity. I have not been able to
edit these settings through the Wavedrum so far, so these
probably are factory settings.</p><h2>Audio data</h2><p>All files in the <code>LOOP</code> directory as well as <code>WD2_DATA.BOR</code> contain raw audio data. Unfortunately, I haven’t
quite figured out the format yet, but you can listen to the
clearly recognisable loop patterns with <code>play</code> (part of
the <a href="http://sox.sourceforge.net">SoX</a> applications):</p><pre><code>find LOOP -name "*.BIN" -print |\
xargs -I XXX \
play -t raw -r 48k -b 16 -e signed-integer -c 1 XXX</code></pre><p>Obviously, this isn’t quite correct. I’m interpreting every 16
bits as a sample in signed integer format, but the sound is
distorted and far from the realistic instrument sound when
playing back the loops through the Wavedrum.</p><p>All loops start with this 44-byte header:</p><pre><code>04 dc 10 d3 uU vV 5W 95 01 d4 00 d0 30 f8 22 b5
46 95 56 95 57 95 57 95 d6 2e 56 95 56 e2 57 95
54 95 46 95 32 f4 22 f4 xX yY 5W 95</code></pre><p>With a few exceptions (namely 0009, 0025, 0027, 0030, 0033, 0036,
0049, 0054, 0064, 0082, 0091, 0103, 0104, 0107, 0108, 0127, 0128,
0129, 0130, 0131, 0132, 0135), vV equals yY. It
seems that loops with the same number of bytes have the exact
same numbers for uU, vV, W, xX, and yY. This is especially
apparent in the loops 0127 to 0132 (inclusive), which are all
192,010 bytes long and all have the values 54:7b:54 for uU:vV:5W
and 88:78:54 for xX:yY:5W.</p><p>Clearly, more work is required to figure out the complete format
of these loop files. Once this is understood we could use custom
loops with the Wavedrum.</p><p>The raw audio data in <code>WD2_DATA.BOR</code> suffers from the
same problems. Although the data can be interpreted as raw
audio, the sound is distorted and playback is unnaturally fast.</p><h2>System files</h2><p>I don’t know what <code>SYSTEM/WDORS110.BIN</code> is used for. The
only useful string contained in the file is “BOOTABLE”. Your
guess is as good as mine as to what it does.</p><p><code>SYSTEM/VERSION.INF</code> is only 16 bytes short and pretty
boring as it contains just what the name implies: version
numbers.</p><pre><code>02 02 01 10 02 02 00 00 57 44 4f 52 00 00 00 00</code></pre><p>This string of numbers is interpreted as follows: firmware
version 2.02, sub-version 1.10, data version 2.02 (followed by
two empty bytes); 57 44 4f 52 (hex for “WDOR”) stands for
“Wavedrum Oriental” (followed by four empty bytes). You can have
the Wavedrum display all its version numbers by pressing the
button labelled “Global” when powering on the device. Note that
the file name <code>WDORS110.BIN</code> references the version
number 1.10, while <code>WDORM202.BIN</code> references the firmware
version number 2.02.</p><p><code>SYSTEM/WDORM202.BIN</code> contains the firmware of the
Wavedrum Oriental. There are many interesting strings and binary
patterns in the file, but I’m still a long way from <em>understanding</em> how it works. To view the strings with the
<code>strings</code> command, you have to specify the encoding as
32-bit little endian:</p><pre><code>strings --encoding L SYSTEM/WDORM202.BIN</code></pre><p>Some of the strings embedded in the firmware are file names, some
of which are not available on the micro SD card. This includes
the following files: SYS00000.BIN, SYS00100.BIN,
SYSTEM/WDORS100.BIN, SYSTEM/WDX_M100.BIN, SYSTEM/WDX_S100.BIN,
and SUBXXXXX.BIN (a pattern?).</p><h2>The programme format</h2><p>Looking at the hexdump of the file <code>WD2_PROG.BOR</code> which
holds all custom presets, I couldn’t find any obvious patterns in
the file, so I resorted to editing a single programme, setting
particular consecutive parameters to easily recognisable
sequences of values (such as 100, 99, 98, and 97 for hd1, hd2,
hd3, and hd4) and locating the changes in the hexdump.</p><img class="full stretch" src="/images/posts/2013/wavedrum-diff.png" alt="Analysing the programme format by changing values and looking at the differences" /><p>This procedure has allowed me to figure out in what order the
parameters are stored in the file. Each programme is exactly 54
16-bit words long; each parameter takes up exactly 16 bits.
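</p><p>Given this layout, one programme record can be split into its parameters with a few lines of Python (the big-endian byte order is my assumption, matching the 4-byte values seen in <code>USER.BOR</code>; the names are illustrative):</p>

```python
import struct

def parse_programme(record):
    """Split one 108-byte programme record into its 54 parameters.
    16-bit words with two's-complement negatives; big-endian byte
    order is an assumption (it matches the values in USER.BOR)."""
    assert len(record) == 108
    return list(struct.unpack(">54h", record))

# 0xFFFA decodes to -6 and 0x0064 to 100:
demo = bytes.fromhex("fffa0064") + bytes(104)
params = parse_programme(demo)
```

<p>Applied to consecutive 108-byte slices, this decodes the whole file.</p><p>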
Negative values are stored in <a href="https://en.wikipedia.org/wiki/Two%27s_complement">two’s
complement</a> format (e.g. negative six is stored as 0xFFFA). The
file is exactly 16200 bytes long which is just enough to hold 150
custom programmes, each taking up 108 bytes.</p><p>I’m currently writing a Haskell library to parse / build
programmes and parameters. The code is available for <a href="https://www.fsf.org/about/what-is-free-software">free</a> <a href="http://git.elephly.net/wavedrum/wavedrum-lib.git">here</a> under
the <a href="https://gnu.org/licenses/gpl.html">GNU GPLv3</a>.</p><p>The parameters are stored in this order:</p><table><thead><tr class="header"><th align="right">identifier</th><th align="left">mode</th><th align="left">target</th><th align="left">name</th></tr></thead><tbody><tr><td align="right">07.1</td><td align="left">Edit 1</td><td align="left">head algorithm</td><td align="left">Pressure curve</td></tr><tr><td align="right">type</td><td align="left">Edit 2</td><td align="left">–</td><td align="left">Pre EQ</td></tr><tr><td align="right">01.1</td><td align="left">Edit 1</td><td align="left">head algorithm</td><td align="left">Tune</td></tr><tr><td align="right">02.1</td><td align="left">Edit 1</td><td align="left">head algorithm</td><td align="left">Decay</td></tr><tr><td align="right">03.1</td><td align="left">Edit 1</td><td align="left">head algorithm</td><td align="left">Level</td></tr><tr><td align="right">04.1</td><td align="left">Edit 1</td><td align="left">head algorithm</td><td align="left">Pan</td></tr><tr><td align="right">05.1</td><td align="left">Edit 1</td><td align="left">head algorithm</td><td align="left">Algorithm select</td></tr><tr><td align="right">hd.1</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td align="left">Algorithm parameter 1</td></tr><tr><td align="right">hd.2</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td align="left">Algorithm parameter 2</td></tr><tr><td align="right">hd.3</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td align="left">Algorithm parameter 3</td></tr><tr><td align="right">hd.4</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td align="left">Algorithm parameter 4</td></tr><tr><td align="right">hd.5</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td align="left">Algorithm parameter 5</td></tr><tr><td align="right">hd.6</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td 
align="left">Algorithm parameter 6</td></tr><tr><td align="right">hd.7</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td align="left">Algorithm parameter 7</td></tr><tr><td align="right">hd.8</td><td align="left">Edit 2</td><td align="left">head algorithm</td><td align="left">Algorithm parameter 8</td></tr><tr><td align="right">01.3</td><td align="left">Edit 1</td><td align="left">rim algorithm</td><td align="left">Tune</td></tr><tr><td align="right">02.3</td><td align="left">Edit 1</td><td align="left">rim algorithm</td><td align="left">Decay</td></tr><tr><td align="right">03.3</td><td align="left">Edit 1</td><td align="left">rim algorithm</td><td align="left">Level</td></tr><tr><td align="right">04.3</td><td align="left">Edit 1</td><td align="left">rim algorithm</td><td align="left">Pan</td></tr><tr><td align="right">05.3</td><td align="left">Edit 1</td><td align="left">rim algorithm</td><td align="left">Algorithm select</td></tr><tr><td align="right">rm.1</td><td align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 1</td></tr><tr><td align="right">rm.2</td><td align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 2</td></tr><tr><td align="right">rm.3</td><td align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 3</td></tr><tr><td align="right">rm.4</td><td align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 4</td></tr><tr><td align="right">rm.5</td><td align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 5</td></tr><tr><td align="right">rm.6</td><td align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 6</td></tr><tr><td align="right">rm.7</td><td align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 7</td></tr><tr><td align="right">rm.8</td><td 
align="left">Edit 2</td><td align="left">rim algorithm</td><td align="left">Algorithm parameter 8</td></tr><tr><td align="right">01.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Tune</td></tr><tr><td align="right">02.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Decay</td></tr><tr><td align="right">03.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Level</td></tr><tr><td align="right">04.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Pan</td></tr><tr><td align="right">05.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">PCM instrument select</td></tr><tr><td align="right">06.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Velocity curve</td></tr><tr><td align="right">07.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Pressure curve</td></tr><tr><td align="right">08.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Pressure tune</td></tr><tr><td align="right">09.2</td><td align="left">Edit 1</td><td align="left">head PCM instrument</td><td align="left">Pressure decay</td></tr><tr><td align="right">01.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Tune</td></tr><tr><td align="right">02.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Decay</td></tr><tr><td align="right">03.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Level</td></tr><tr><td align="right">04.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Pan</td></tr><tr><td align="right">05.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">PCM instrument 
select</td></tr><tr><td align="right">06.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Velocity curve</td></tr><tr><td align="right">07.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Pressure curve</td></tr><tr><td align="right">08.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Pressure tune</td></tr><tr><td align="right">09.4</td><td align="left">Edit 1</td><td align="left">rim PCM instrument</td><td align="left">Pressure decay</td></tr><tr><td align="right">10.1</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Reverb type</td></tr><tr><td align="right">10.2</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Reverb effect level</td></tr><tr><td align="right">10.3</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Reverb decay time</td></tr><tr><td align="right">10.4</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Reverb frequency damping</td></tr><tr><td align="right">11.3</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Delay feedback</td></tr><tr><td align="right">11.2</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Delay effect level</td></tr><tr><td align="right">11.1</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Delay time</td></tr><tr><td align="right">11.4</td><td align="left">Edit 1</td><td align="left">–</td><td align="left">Delay frequency damping</td></tr></tbody></table><h2>Thanks</h2><p>The following tools have proven indispensable in the analysis:</p><ul><li><a href="http://www.isthe.com/chongo/tech/comp/calc/">calc</a>, a
calculator for the command line supporting hexadecimal
representations of numbers (both input and output using the
<code>base</code> and <code>base2</code> functions)</li><li><a href="http://www.cjmweb.net/vbindiff/">vbindiff</a>, a tool to
visualise differences between two binary files with a
split-screen hexadecimal display</li><li><a href="https://wiki.gnome.org/Ghex">ghex</a>, a simple
hexadecimal editor supporting pattern search and highlighting</li></ul><h2>Call for help</h2><p>If you own an earlier model of the Wavedrum or the latest
Wavedrum Global Edition, I would be <em>very</em> happy if you
could send me a block-level copy of the SD card (see above for
instructions). This would allow me to understand the firmware
better and maybe even make it possible to upgrade an older
Wavedrum to the latest version (taking into account possible
hardware differences, such as differing memory size).</p><p>Please send a link to the micro SD card image to <span class="obfuscated">sflbepAfmfqimz/ofu</span>.</p><p>Read <a href="/tags/wavedrum.html">more posts about the Wavedrum
here</a>.</p>