Building open source Swift on ARMv7

Apple recently open sourced their Swift language and compiler. I'm pretty excited about this (I'm a huge Swift fan), and I wanted a project to help motivate me to dig in. I also do a lot of embedded Linux work at my job, so I'm eager to use this great language there, too. I found an existing feature request in the issue tracker that Apple set up for Swift. I was pleased that they had already given it 'medium' priority, but I know that if you want something to happen sooner rather than later, you should just start working on it.

I've done enough Linux work to know that if you start deviating from the officially supported distro, you start making work for yourself pretty quickly. Therefore, I decided to limit myself to ARMv7 systems that have an Ubuntu 14.04 image. In my case that was a BeagleBone Black (BBB) and an Nvidia Tegra TK1. If you can, go with the Nvidia every time (the TX1 has 3GB of RAM). It's much faster, and has many times the BBB's 512MB of RAM. Whatever amount of RAM you have, you'll need to set up some swap space. Also, depending on whether you're using an SD card, you might need to set aside at least 8GB of space to store the source and build products.
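
For the swap space, a swap file is usually the least invasive option. The following is a minimal sketch, assuming a 2GB file at the path /swapfile and root access; SWAP_FILE, SWAP_MB and DRY_RUN are names made up for this example, and by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: create and enable a swap file.
# SWAP_FILE, SWAP_MB and DRY_RUN are illustrative names, not part of the
# Swift build scripts. With DRY_RUN=1 (the default) nothing is changed.
SWAP_FILE="${SWAP_FILE:-/swapfile}"
SWAP_MB="${SWAP_MB:-2048}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_swap() {
    run dd if=/dev/zero of="$SWAP_FILE" bs=1M count="$SWAP_MB"
    run chmod 600 "$SWAP_FILE"
    run mkswap "$SWAP_FILE"
    run swapon "$SWAP_FILE"
}

setup_swap
```

Running it as root with DRY_RUN=0 performs the four steps for real. The size is a guess; pick whatever your storage can spare.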

utils/build-script

To begin with, we need to understand the build process. Apple's documentation makes it plain that you should use the utils/build-script script to build, and that any processes outside of that system aren't supported:

For all automated build environments, this tool is regarded as the only way to build Swift. This is not a technical limitation of the Swift build system. It is a policy decision aimed at making the builds uniform across all environments and easily reproducible by engineers who are not familiar with the details of the setups of other systems or automated environments.

This script is responsible for parsing the arguments presented on the command line, printing help text, making build directories, and preparing things for utils/build-script-impl. A read through yields no information about architectures at all, so we move on to build-script-impl.

utils/build-script-impl

Most of the real work occurs in build-script-impl. Right away, we're presented with architecture-specific parts.

LLVM_TARGETS_TO_BUILD="X86;ARM;AArch64"

This line sets a variable that will eventually tell the LLVM build process (several packages need to build to support Swift; one of those is LLVM, another is Clang) which architectures it should be able to generate machine code for.
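
Since the value is just a semicolon-separated list, it is easy to check whether a given backend is included. Here is a small sketch; has_llvm_target is a hypothetical helper written for this post, not something build-script-impl provides.

```shell
#!/bin/sh
LLVM_TARGETS_TO_BUILD="X86;ARM;AArch64"

# Return success if backend $1 appears in the semicolon-separated list.
# Wrapping the list in ";" lets us match whole entries only.
has_llvm_target() {
    case ";${LLVM_TARGETS_TO_BUILD};" in
        *";$1;"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if has_llvm_target ARM; then
    echo "ARM backend will be built"
fi
```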

This script is a little hard to follow because it's well over 1000 LOC, and there are functions mixed in with top-level code. In this section, I'll follow the order of execution rather than the order of the file. I'll also omit anything that doesn't have a bearing on target/host architecture.

# A list of deployment targets to compile the Swift host tools for, in cases
# where we can run the resulting binaries natively on the build machine.
NATIVE_TOOLS_DEPLOYMENT_TARGETS=()
# A list of deployment targets to cross-compile the Swift host tools for.
# We can't run the resulting binaries on the build machine.
CROSS_TOOLS_DEPLOYMENT_TARGETS=()
# Determine the native deployment target for the build machine, that will be
# used to jumpstart the standard library build when cross-compiling.
case "$(uname -s -m)" in
    Linux\ x86_64)
        NATIVE_TOOLS_DEPLOYMENT_TARGETS=("linux-x86_64")
        ;;
    Linux\ armv7*)
        NATIVE_TOOLS_DEPLOYMENT_TARGETS=("linux-armv7")
        ;;
    Linux\ aarch64)
        NATIVE_TOOLS_DEPLOYMENT_TARGETS=("linux-aarch64")
        ;;
    Darwin\ x86_64)
        NATIVE_TOOLS_DEPLOYMENT_TARGETS=("macosx-x86_64")
        ;;
    FreeBSD\ x86_64)
        NATIVE_TOOLS_DEPLOYMENT_TARGETS=("freebsd-x86_64")
        ;;
    *)
        echo "Unknown operating system"
        exit 1
        ;;
esac

This section of code determines what the native platform is, both in terms of OS and architecture. NATIVE_TOOLS_DEPLOYMENT_TARGETS is an array of native targets. It could hold, for example, multiple ARM targets that may be binary compatible, like different hard-float or soft-float versions. The case block matches on the output of uname -s -m, which on most ARM platforms returns something like this: Linux armv7l. You'll almost always see the l after armv7. We don't want to be that specific, so we match just the armv7 part. CROSS_TOOLS_DEPLOYMENT_TARGETS specifies the cross compilers that should be built. However, we can't just put linux-armv7 into this list to build a cross compiler, not yet anyway:

# Sanitize the list of cross-compilation targets.
    for t in ${CROSS_COMPILE_TOOLS_DEPLOYMENT_TARGETS} ; do
        case ${t} in
            iphonesimulator-i386 | iphonesimulator-x86_64 | \
            iphoneos-arm64 | iphoneos-armv7 | \
            appletvos-arm64 | appletvsimulator-x86_64 | \
            watchos-armv7k | watchsimulator-i386)
                CROSS_TOOLS_DEPLOYMENT_TARGETS=(
                    "${CROSS_TOOLS_DEPLOYMENT_TARGETS[@]}"
                    "${t}"
                )
                ;;
            *)
                echo "Unknown deployment target"
                exit 1
                ;;
        esac
    done    

This code exits with an error on any entry in that array that isn't in the list above. It would probably be possible to add Linux targets there eventually, but for now let's move on. I want a native compiler for ARM, not just a cross compiler. The next thing we come across is the determination of the platforms that the stdlib is built for. This case, like the prior one, matches on armv7 and ignores the suffix.

# A list of deployment targets that we compile or cross-compile the
# Swift standard library for.
    STDLIB_DEPLOYMENT_TARGETS=()
    case "$(uname -s -m)" in
        Linux\ x86_64)
            STDLIB_DEPLOYMENT_TARGETS=("linux-x86_64")
            ;;
        Linux\ armv7*)
            STDLIB_DEPLOYMENT_TARGETS=("linux-armv7")
            ;;
        Linux\ aarch64)
            STDLIB_DEPLOYMENT_TARGETS=("linux-aarch64")
            ;;
        Darwin\ x86_64)
            STDLIB_DEPLOYMENT_TARGETS=(
                "macosx-x86_64"
                "iphonesimulator-i386"
                "iphonesimulator-x86_64"
                "appletvsimulator-x86_64"
                "watchsimulator-i386"
                # Put iOS native targets last so that we test them last
                # (it takes a long time).
                "iphoneos-arm64"
                "iphoneos-armv7"
                "appletvos-arm64"
                "watchos-armv7k"
            )
            ;;
        FreeBSD\ x86_64)
            STDLIB_DEPLOYMENT_TARGETS=("freebsd-x86_64")
            ;;
        *)
            echo "Unknown operating system"
            exit 1
            ;;
    esac    
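
Both case blocks rely on the shell glob armv7* to swallow whatever suffix the kernel reports. The sketch below isolates that matching so it can be tried without running a build; native_target is a hypothetical helper, and only the patterns come from build-script-impl.

```shell
#!/bin/sh
# Map a "uname -s -m" style string to a deployment target name.
# native_target is a hypothetical helper; the patterns mirror the
# case blocks in build-script-impl.
native_target() {
    case "$1" in
        Linux\ x86_64)   echo "linux-x86_64" ;;
        Linux\ armv7*)   echo "linux-armv7" ;;
        Linux\ aarch64)  echo "linux-aarch64" ;;
        Darwin\ x86_64)  echo "macosx-x86_64" ;;
        FreeBSD\ x86_64) echo "freebsd-x86_64" ;;
        *)
            echo "Unknown operating system: $1" >&2
            return 1
            ;;
    esac
}

# The trailing "l" in armv7l is absorbed by the glob.
native_target "Linux armv7l"
```

Passing "$(uname -s -m)" instead of the literal string shows what the build script would pick on the current machine.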

After skipping lots more functions, we come across the meat of the process.

#
# Configure and build each product
#
# Start with native deployment targets because the resulting tools 
# are used during cross-compilation.
for deployment_target in "${NATIVE_TOOLS_DEPLOYMENT_TARGETS[@]}" \
                         "${CROSS_TOOLS_DEPLOYMENT_TARGETS[@]}"; do
    set_deployment_target_based_options

    # ~~~ skipped compile option flag management ~~~

    # Build.
done

function set_deployment_target_based_options() {
    llvm_cmake_options=()
    swift_cmake_options=()
    cmark_cmake_options=()
    swiftpm_bootstrap_options=()

    case $deployment_target in
        linux-x86_64)
            SWIFT_HOST_VARIANT_ARCH="x86_64"
            ;;
        linux-armv7)
            SWIFT_HOST_VARIANT_ARCH="armv7"
            ;;
        linux-aarch64)
            SWIFT_HOST_VARIANT_ARCH="aarch64"
            ;;
        freebsd-x86_64)
            SWIFT_HOST_VARIANT_ARCH="x86_64"
            ;;
        macosx-* | iphoneos-* | iphonesimulator-* | \
          appletvos-* | appletvsimulator-* | \
            watchos-* | watchsimulator-*)
            # ~~~ snipped tons of cross-compile for Apple devices stuff ~~~
            ;;
        *)
            echo "Unknown compiler deployment target: $deployment_target"
            exit 1
            ;;
    esac
}

The important thing here is that the build process runs in full for each native target and each cross target. At the beginning of each pass, the set_deployment_target_based_options function runs and sets the host variant for Swift. Remember that, at this point, the "host variant" really describes the deployment target being built, not the machine running the build.

The rest of the script does the work of testing, packaging, and installing.

CMakeLists.txt

The CMakeLists.txt file provides CMake instructions on how to actually build the tools using the flags and options determined by build-script and build-script-impl. I am far less than a novice in all things CMake, so please correct me if I make mistakes in this area, but this is what I'm able to infer from what's happening.

Within CMakeLists.txt, we have to manually set CMAKE_SYSTEM_PROCESSOR unless the build is cross-compiled or the value is otherwise already set. On Linux (and FreeBSD) it's set to the value of uname -m; on OS X it's just set to i386.

# Reset CMAKE_SYSTEM_PROCESSOR if not cross-compiling.
# CMake refuses to use `uname -m` on OS X
# http://public.kitware.com/Bug/view.php?id=10326
if(NOT CMAKE_CROSSCOMPILING AND CMAKE_SYSTEM_PROCESSOR STREQUAL "i386")
  execute_process(
    COMMAND "uname" "-m"
    OUTPUT_VARIABLE CMAKE_SYSTEM_PROCESSOR
    OUTPUT_STRIP_TRAILING_WHITESPACE)
endif()    

Later, we use the value of CMAKE_SYSTEM_PROCESSOR to determine what architecture to build. Remember that if we're not cross-compiling, its value is the host architecture. Also, note that cross-compiling is only available on Darwin (macosx):

# FIXME: separate the notions of SDKs used for compiler tools and target
# binaries.
if("${CMAKE_SYSTEM_NAME}" STREQUAL "Linux")
  set(CMAKE_EXECUTABLE_FORMAT "ELF")

  set(SWIFT_HOST_VARIANT "linux" CACHE STRING
      "Deployment OS for Swift host tools (the compiler) [linux].")

  set(SWIFT_HOST_VARIANT_SDK "LINUX")
  set(SWIFT_PRIMARY_VARIANT_SDK_default "LINUX")

  # FIXME: This will not work while trying to cross-compile.
  if("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "x86_64")
    configure_sdk_unix(LINUX "Linux" "linux" "linux" 
      "x86_64" "x86_64-unknown-linux-gnu")
    set(SWIFT_HOST_VARIANT_ARCH "x86_64")
    set(SWIFT_PRIMARY_VARIANT_ARCH_default "x86_64")
  # FIXME: This only matches ARMv7l (by far the most common variant).
  elseif("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "armv7l")
    configure_sdk_unix(LINUX "Linux" "linux" "linux" 
      "armv7" "armv7-unknown-linux-gnueabihf")
    set(SWIFT_HOST_VARIANT_ARCH "armv7")
    set(SWIFT_PRIMARY_VARIANT_ARCH_default "armv7")
  elseif("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "aarch64")
    configure_sdk_unix(LINUX "Linux" "linux" "linux" 
      "aarch64" "aarch64-unknown-linux-gnu")
    set(SWIFT_HOST_VARIANT_ARCH "aarch64")
    set(SWIFT_PRIMARY_VARIANT_ARCH_default "aarch64")
  else()
    message(FATAL_ERROR "Unknown or unsupported architecture: ${CMAKE_SYSTEM_PROCESSOR}")
  endif()
# ~~~ snipped FreeBSD and Darwin branches ~~~

To make sense of this, we really need to dig into the implementation of the configure_sdk_unix macro in SwiftConfigureSDK.cmake; otherwise the arguments are just a bunch of "linux" strings with an arch and a triple.

macro(configure_sdk_unix
  prefix name lib_subdir triple_name arch triple)
  # Note: this has to be implemented as a macro because it sets global
  # variables.

  set(SWIFT_SDK_${prefix}_NAME "${name}")
  set(SWIFT_SDK_${prefix}_PATH "/")
  set(SWIFT_SDK_${prefix}_VERSION "don't use")
  set(SWIFT_SDK_${prefix}_BUILD_NUMBER "don't use")
  set(SWIFT_SDK_${prefix}_DEPLOYMENT_VERSION "don't use")
  set(SWIFT_SDK_${prefix}_LIB_SUBDIR "${lib_subdir}")
  set(SWIFT_SDK_${prefix}_VERSION_MIN_NAME "")
  set(SWIFT_SDK_${prefix}_TRIPLE_NAME "${triple_name}")
  set(SWIFT_SDK_${prefix}_ARCHITECTURES "${arch}")

  set(SWIFT_SDK_${prefix}_ARCH_${arch}_TRIPLE "${triple}")

  # Add this to the list of known SDKs.
  list(APPEND SWIFT_CONFIGURED_SDKS "${prefix}")

  _report_sdk("${prefix}")
endmacro()

This macro sets a number of CMake variables that will be used when generating both the Swift compiler and the standard library. An interesting thing to note is that the fourth argument (triple_name, currently just "linux") should probably be arch-specific, like linux-x86_64 or linux-arm.
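
Because the variable names are assembled from the arguments, it helps to see them spelled out for one concrete call. The sketch below is not CMake; show_sdk_vars is a hypothetical shell helper that echoes the names and values the macro would set for the ARMv7 call shown earlier (the constant "don't use" entries are left out).

```shell
#!/bin/sh
# Print the variables configure_sdk_unix would define for one call.
# show_sdk_vars is a hypothetical helper; it only echoes names and values.
show_sdk_vars() {
    prefix="$1"
    name="$2"
    lib_subdir="$3"
    triple_name="$4"
    arch="$5"
    triple="$6"
    echo "SWIFT_SDK_${prefix}_NAME=${name}"
    echo "SWIFT_SDK_${prefix}_LIB_SUBDIR=${lib_subdir}"
    echo "SWIFT_SDK_${prefix}_TRIPLE_NAME=${triple_name}"
    echo "SWIFT_SDK_${prefix}_ARCHITECTURES=${arch}"
    echo "SWIFT_SDK_${prefix}_ARCH_${arch}_TRIPLE=${triple}"
}

show_sdk_vars LINUX Linux linux linux armv7 armv7-unknown-linux-gnueabihf
```

The last line it prints, SWIFT_SDK_LINUX_ARCH_armv7_TRIPLE, is the variable that ends up behind the -target flag.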

While building Swift files, these variables are ultimately consumed in the cmake/modules/AddSwift.cmake file for a variety of tasks, such as finding the path to the libraries, setting SDK and target flags, etc. Note that in this module, what was ${prefix} becomes ${sdk}.

function(_add_variant_swift_compile_flags
    # … snip …
    list(APPEND result
        "-sdk" "${SWIFT_SDK_${sdk}_PATH}"
        "-target" "${SWIFT_SDK_${sdk}_ARCH_${arch}_TRIPLE}")
    # … snipped stuff about optimization, etc. …
endfunction()

function(_add_variant_link_flags
    # … snip …
    _add_variant_c_compile_link_flags(
        "${sdk}"
        "${arch}"
        "${build_type}"
        "${enable_assertions}"
        result)
        
    if("${sdk}" STREQUAL "LINUX")
        list(APPEND result "-lpthread" "-ldl")
    elseif("${sdk}" STREQUAL "FREEBSD")
        # No extra libraries required.
    else()
        list(APPEND result "-lobjc")
    endif()
    # … snip …
endfunction()    

In cases where it's not obvious where control flow leaves you, especially with how CMake seems to make liberal use of globals, I find it helpful to grep for variable names. Doing so for SWIFT_SDK_${ led me to...
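
A search along those lines might look like the sketch below. find_sdk_var_users is a hypothetical helper, and SWIFT_SOURCE_ROOT is assumed here to point at your checkout; -F keeps grep from treating ${ as part of a regular expression.

```shell
#!/bin/sh
# List every CMake file under directory $1 that reads a SWIFT_SDK_${...}
# variable. find_sdk_var_users is a hypothetical helper for this post.
find_sdk_var_users() {
    grep -rlF --include='CMakeLists.txt' --include='*.cmake' \
        'SWIFT_SDK_${' "$1" | sort
}

# Only search when a checkout location has been provided (assumed variable).
if [ -n "${SWIFT_SOURCE_ROOT:-}" ]; then
    find_sdk_var_users "$SWIFT_SOURCE_ROOT"
fi
```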

stdlib/public/runtime/CMakeLists.txt

Where I found this gem!

foreach(sdk ${SWIFT_CONFIGURED_SDKS})
  if("${sdk}" STREQUAL "LINUX" OR "${sdk}" STREQUAL "FREEBSD")
    foreach(arch ${SWIFT_SDK_${sdk}_ARCHITECTURES})
      set(arch_subdir "${SWIFT_SDK_${sdk}_LIB_SUBDIR}/${arch}")

      # FIXME: We will need a different linker script for 32-bit builds.
      configure_file(
          "swift.ld" "${SWIFTLIB_DIR}/${arch_subdir}/swift.ld" COPYONLY)

      swift_install_in_component(compiler
          FILES "swift.ld"
          DESTINATION "lib/swift/${arch_subdir}")

    endforeach()
  endif()
endforeach()

This FIXME is a big deal, because I hadn't found it before. I was having linker issues with my binary, and this holds some promise for addressing them. I'll go ahead and add a check here, copy the 64-bit script, and edit it for 32-bit.

foreach(sdk ${SWIFT_CONFIGURED_SDKS})
  if("${sdk}" STREQUAL "LINUX" OR "${sdk}" STREQUAL "FREEBSD")
    foreach(arch ${SWIFT_SDK_${sdk}_ARCHITECTURES})
      set(arch_subdir "${SWIFT_SDK_${sdk}_LIB_SUBDIR}/${arch}")

      # configure_sdk_unix passes "armv7" as the arch for 32-bit ARM.
      if("${arch}" STREQUAL "armv7")
        configure_file(
            "swift_32.ld" "${SWIFTLIB_DIR}/${arch_subdir}/swift.ld" COPYONLY)
      else()
        configure_file(
            "swift_64.ld" "${SWIFTLIB_DIR}/${arch_subdir}/swift.ld" COPYONLY)
      endif()

      swift_install_in_component(compiler
          FILES "swift.ld"
          DESTINATION "lib/swift/${arch_subdir}")

    endforeach()
  endif()
endforeach()

The 64-bit ld script is pretty small, and doesn't seem too intimidating (I've never seen an ld script before):

SECTIONS
{
  .swift2_protocol_conformances :
  {
    .swift2_protocol_conformances_start = . ;
    QUAD(SIZEOF(.swift2_protocol_conformances) - 8) ;
    *(.swift2_protocol_conformances) ;
  }
}
INSERT AFTER .dtors

The only problem is that it's far from obvious what needs to change when porting that to a 32-bit system! My only guess is that 8*8 is 64 bits, so maybe that -8 needs to be a -4 (4*8 is 32 bits)! :)

SECTIONS
{
  .swift2_protocol_conformances :
  {
    .swift2_protocol_conformances_start = . ;
    QUAD(SIZEOF(.swift2_protocol_conformances) - 4) ;
    *(.swift2_protocol_conformances) ;
  }
}
INSERT AFTER .dtors

Unfortunately, that got me nowhere. But, I'm getting ahead of myself. Moving on...

stdlib/public/SwiftShims/LibcShims.h

This one is an easy one. All we have to do is make a new typedef for __swift_ssize_t for 32-bit ARM:

#if defined(__linux__) && defined(__arm__)
typedef int __swift_ssize_t;
#else
typedef long int __swift_ssize_t;
#endif

stdlib/public/stubs/Stubs.cpp

Next, we have to deal with the fact that a multiply with overflow is missing from what libgcc provides us. I'll leave this out for brevity, but basically I just copied an implementation from the compiler-rt project.

lib/Driver/ToolChains.cpp

ToolChains.cpp is an interesting one because it is responsible for linking the compiled pieces together when building Swift applications. In the original version, the x86_64 path to the Swift stdlib was hard-coded in there. Now, we just have to query the target triple and select the correct path to the library.

- Arguments.push_back(
-   context.Args.MakeArgString(Twine(RuntimeLibPath) + "/x86_64/swift.ld"));

// ~ becomes ~

+ Arguments.push_back(context.Args.MakeArgString(
+   Twine(RuntimeLibPath) + "/" + getTriple().getArchName() + "/swift.ld"));
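
To see what path that produces, here is a rough shell equivalent. swift_ld_path is a hypothetical helper and /usr/lib/swift/linux is only an assumed install location; the arch name is taken to be the first component of the triple, which is what getArchName() returns.

```shell
#!/bin/sh
# Build the linker script path from a runtime library path and a target
# triple. swift_ld_path is a hypothetical helper for illustration.
swift_ld_path() {
    runtime_lib_path="$1"
    triple="$2"
    # ${triple%%-*} drops everything from the first "-" onward.
    arch="${triple%%-*}"
    echo "${runtime_lib_path}/${arch}/swift.ld"
}

swift_ld_path /usr/lib/swift/linux armv7-unknown-linux-gnueabihf
```

With the ARMv7 triple this prints /usr/lib/swift/linux/armv7/swift.ld, which lines up with the lib/swift/${arch_subdir} install destination in the runtime CMakeLists.txt.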

Testing files

There were several files that affect the test suite. This post is already way too long, so I'll leave those out. If you want to see all the changes I've made, check out the commit:

Summary

This port is getting pretty close. My 32-bit linker script doesn't fix anything, and I'm still looking for other potential problems. But for now, I'm happy with the progress. I hope this helps anyone else exploring other ports. The 32-bit ARM port is potentially the simplest possible one, because the older iPhones and the Apple Watch use 32-bit ARM chips.

Update 12/16/2015

I was able to fix the linking problem by looking at another exciting Swift port (SwiftAndroid), and noticing that -Bsymbolic is being used in the AddSwift CMake module.

if("${sdk}" STREQUAL "LINUX")
  list(APPEND result "-lpthread" "-ldl" "-Wl,-Bsymbolic")
elseif("${sdk}" STREQUAL "FREEBSD")
  # No extra libraries required.
else()
  list(APPEND result "-lobjc")
endif()

I added that to my copy of AddSwift, and it seems to work beautifully. I will do some more testing and clean up, and submit a new pull request.

$ cat hello.swift 
print("Hello world!")
$ swiftc hello.swift 
$ ./hello
Hello world!
$ uname -s -m
Linux armv7l

2015-12-14T15:17:32