my talk from HighLoad++ 2013 -- about scaling compiled applications, but from the point of view of scaling up from supporting 1 platform to supporting MANY platforms.
in other words: given an application that supports ubuntu 10.04, what sort of systems, tips, and tricks are needed to help scale support to other ubuntus, redhats, centos, windows, etc.
21. Repeatable builds
• the most important thing for compiled software
• set up a build server
• with jenkins (or another build server)
• backup your built objects
• copying them to s3 is not a horrible idea
• regardless of source control system or branching strategy,
ensure you can always rebuild any version of your software
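One simple way to make "rebuild any version" possible is to tag every automated build in source control. A minimal sketch, assuming git; the tag format and build number are invented for illustration, not from the talk:

```shell
# sketch: tag every automated build so the exact source for any shipped
# version can be checked out again later (tag naming is an assumption)
set -e
dir=$(mktemp -d); cd "$dir"
git init -q repo; cd repo
git -c user.name=ci -c user.email=ci@example.com \
    commit -q --allow-empty -m "work for release"

BUILD_NUMBER=137                      # normally injected by the build server
git tag "build-$BUILD_NUMBER"         # one immutable tag per build

# later: "time travel" back to that exact build
git checkout -q "build-$BUILD_NUMBER"
```

Jenkins (and most build servers) expose the build number as an environment variable, so this tagging step can run as part of every build job.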
23. Jenkins problems
• git plugin didn’t work on windows (maybe fixed?)
• branch building is painful, jenkins API can help
• getting windows build agent working is painful
31. chroot:
an operation that changes the apparent root directory for the
current running process [...].
A program that is run in such a modified environment cannot
name (and therefore normally cannot access) files outside the
designated directory tree.
(from wikipedia)
40. KVM
• Create a base image on disk
• Clone base image
• Boot the cloned image
• Do the build and copy built object out.
• Delete cloned image when done
• Base image is still pristine and can be reused.
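The clone/build/delete cycle above can be scripted. A sketch using qemu-img backing files and virsh; image paths, VM names, and the artifact location are all hypothetical, and commands are echoed (dry run) so the sequence can be read without libvirt installed:

```shell
# dry-run sketch of the KVM cleanroom cycle; change "echo $@" to "$@"
# to actually execute (requires qemu/libvirt)
run() { echo "$@"; }

BASE=/var/lib/images/centos6-base.qcow2   # pristine base image (assumed path)
CLONE=/var/lib/images/build-tmp.qcow2     # throwaway copy-on-write clone

run qemu-img create -f qcow2 -b "$BASE" "$CLONE"   # clone the base image
run virsh create /etc/libvirt/qemu/build-vm.xml    # boot the cloned image
run scp builder@build-vm:/out/myapp.rpm .          # copy the built object out
run virsh destroy build-vm                         # stop the clone when done
run rm -f "$CLONE"                                 # delete it; BASE stays pristine
```

The qcow2 backing-file clone is cheap because only blocks the build writes are stored in the clone; the base image is never modified.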
43. Create builds in cleanroom
• Avoid contaminating builds with artifacts from previous builds.
• chroots help
• use mock or pbuilder for RPMs and DEBs
• KVM, EC2, or equivalent for everything else
• Always create pristine base images and do builds in a copy.
• Use SSDs
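Typical invocations of the chroot tools named above look like this. Config names, versions, and package file names are illustrative; commands are echoed (dry run) since mock and pbuilder need root and a prepared host:

```shell
# dry-run sketch; change "echo $@" to "$@" to actually execute
run() { echo "$@"; }

# RPMs: mock manages per-distribution chroots (configs under /etc/mock)
run mock -r epel-6-x86_64 --rebuild myapp-1.4.2-1.src.rpm

# DEBs: pbuilder keeps a compressed base chroot and builds in a fresh copy
run pbuilder create --distribution precise     # one-time base image creation
run pbuilder build myapp_1.4.2-1.dsc           # build inside a clean extraction
```

Both tools extract a pristine base, build inside the copy, and throw the copy away, which is exactly the cleanroom property the slide asks for.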
45. Tool problems
• git-buildpackage can’t set changelog distribution field
• signing key setup is really painful (especially for RPMs)
• deb naming scheme for packages is quite painful
• these tools are mostly undocumented and very hard to actually use
• recent versions of KVM provided by ubuntu sometimes fail to boot VMs
48. two types of linking....
• dynamic linking
• static linking
49. static linking
• calls to functions in library are resolved at compile time.
• code from the library is copied into the resulting binary.
50. dynamic linking
• calls to functions are resolved at runtime.
• code for a library lives in its own object:
• libblah.so.4
• libblah.dll
58. static linking
• figure out which libraries your app needs
• pick a supported major release, if possible
• build and package this library
• link it statically against your binary during build
• you now have fewer stones to turn over when debugging
60. static linking
• you will need to track your upstream deps
• you will probably want to package your upstream deps
• you can then point your chroots and build envs at private deb, rpm, etc. repos in your internal infrastructure
• merging upstream changes in becomes its own project
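Pointing build environments at an internal repo is a one-line config change on each side. A sketch; the hostname, distribution names, and repo layout are invented:

```
# /etc/apt/sources.list.d/internal.list -- deb-based chroots and build envs
deb http://repo.internal.example.com/ubuntu precise main

# /etc/yum.repos.d/internal.repo -- rpm-based chroots and build envs
[internal]
name=internal packages
baseurl=http://repo.internal.example.com/el6/x86_64/
enabled=1
gpgcheck=1
```

mock and pbuilder both let you add extra repositories to their chroot configs, so packaged upstream deps resolve automatically during cleanroom builds.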
75. Use the build system
Determine which file to build at compile time.
78. Use modules
• break up ifdef soup into separate files
• use the build system to compile the right file at build time
• this seems obvious, but many C libraries and programs are full of crazy ifdef soup
80. Use modules
• very easy to fall down a rabbit hole breaking things apart
• can make the build process more complicated and harder to debug
85. Capture debug symbols
• DEB and RPM can both output debug packages that contain debug symbols.
• output these packages.
• store these and make backups.
• (or just don’t strip your binary)
87. Use google-coredumper
• you can use google-coredumper to catch segfaults, bus errors, and other bad things.
• you can output a coredump when this happens.
• you can use this coredump and your debug symbols to figure out what is going on.
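A sketch of wiring this up, assuming google-coredumper is installed and linked with `-lcoredumper`; `WriteCoreDump()` is the function that library exports, but the handler wiring here is illustrative, not code from the talk:

```c
/* sketch: dump core from a signal handler via google-coredumper */
#include <google/coredumper.h>   /* provided by google-coredumper */
#include <signal.h>

static void crash_handler(int sig) {
    WriteCoreDump("core.myapp");  /* write a core of the current process */
    signal(sig, SIG_DFL);         /* restore default handling... */
    raise(sig);                   /* ...and re-raise so the process still dies */
}

int main(void) {
    signal(SIGSEGV, crash_handler);
    signal(SIGBUS, crash_handler);
    /* ... application code ... */
    return 0;
}
```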
88. Plan for failure
• Have a backup plan
• Capture debug symbols during your automated build process.
• Store them somewhere safe (and make backups).
• Capture coredumps (if possible).
• Use coredumps and debug symbols to figure out what happened.
90. Plan for failure
• can significantly increase complexity
• google-coredumper can’t help if your kernel is buggy
• some linux distributions don’t allow ptrace
• google-coredumper only supports linux
99. Check things like...
• Is the binary actually statically linked?
• Does it get copied to the right path?
• Are the right config files autogenerated?
• Does the version string the program outputs match the package version?
• ....
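The version-string check, for instance, is a few lines in a build job. A sketch in which a stub binary stands in for the real build artifact; in a real pipeline `app` and `PKG_VERSION` come from the build itself:

```shell
# smoke check: binary's reported version must match the package version
set -e
dir=$(mktemp -d); cd "$dir"
PKG_VERSION=1.4.2                        # from the spec file / changelog
printf '#!/bin/sh\necho "myapp 1.4.2"\n' > app   # stub for the built binary
chmod +x app

if ./app | grep -qF "$PKG_VERSION"; then
    echo "version check passed"
else
    echo "version mismatch" >&2
    exit 1
fi
```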
102. Automated Testing
• It will be impossible to build and test every change on every supported platform.
• Use your build server to do this for you.
• Test things like:
• installing/uninstalling the object
• object is actually statically linked
• correctness testing (RSpec can be useful)
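The "is it actually statically linked" check can be automated with `file(1)` (or `ldd`). `is_static` is a hypothetical helper, not a tool from the talk:

```shell
# automated check: is a shipped binary really statically linked?
is_static() {
    # -b: bare output, -L: follow symlinks such as /bin/sh -> dash
    file -bL "$1" | grep -q 'statically linked'
}

if is_static /bin/sh; then
    echo "/bin/sh is statically linked"
else
    echo "/bin/sh is dynamically linked"
fi
```

Run against your freshly built binary in the test job, this catches the common regression where a build-system change silently switches a dependency back to dynamic linking.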
104. Automated testing
• Fine line between not enough and too much
• Windows can be painful, but cygwin can help with scripting
• Easy to forget about testing the actual package install/removal
• Can be difficult to get working with branch builds
{"9":"i am talking about scaling ....\n","47":"linking!\n","28":"there are LOTS of different tools you can use to do this and there really is no “correct” answer. i’m going to name some tools that I found useful, but any tool that allows you to generate a cleanroom for a build is sufficient.\n","66":"i write a lot of C code, so this next example of why this mindset “just add one more thing” is bad is from an example i’ve seen again and again and again in C code.\n","76":"remember we want this\n","19":"jenkins allows you to setup build agents on different machines, i’ve been able to get build slaves working for lots of different versions of linux, windows, and smartos.\n","57":"you want everything to be neatly organized so you can easily find and fix bugs and performance problems.\nthe only way to ensure that the same version of libpcap is running on every operating system you support is to use the build system i talked about before to build and package a version of each library you depend on.\nwhen you do the final build of your actual binary, you static link those dependencies and you end up with a binary that was built from the same code on every platform you support.\n","95":"it will be impossible to manually check that packaged object compiles, installs, and works on every single platform.\n","38":"for everything else, you can use:\n- KVM, the built in linux kernel based virtualization\nor\n- Amazon EC2 to spin up and destroy the operating systems you want\nor\n- some other virtualization system like VMWare ESX or something else.\n","10":"compiled software to...\n","48":"there are two types of linking...\ndynamic linking\nand\nstatic linking\nand i am going to briefly describe them\n","29":"i’m going to describe some tools for buildings RPMs, DEBs, and everything else.\nthe reason why RPMs and DEBs are separate is because there are already a set of tools for building these types of objects in a cleanroom like environment via chroots.\nso i will describe those 
first, then move on to everything else.\n","20":"make sure to backup your built objects, somehow. copying them to s3 is a good idea.\n","96":"again, you can use something like jenkins to help you install and test your built objects on....\n","39":"luckily jenkins has plugins for KVM and EC2.\nunfortunately they may not actually do what you want, so you may need to check them out before you try using the plugins. the plugins are open source, or you could just script your KVM and EC2 usage yourself.\ni ended up scripting my KVM usage myself.\n","77":"not this\n","106":"repeatable builds\n","30":"to understand the next few tools i first need to explain what a chroot actually is at a high level.\n","11":"heterogeneous computer systems. \nwhat i mean is: scaling compiled software from platform like ubuntu 12.04 to 10 or 14 platforms, different ubuntus, centos, debian, etc.\n","49":"static linking ...\n","97":"lots of different platforms.\nfor the linux and unix-like platforms, you can write scripts that jenkins build slaves will run which will launch a VM, download your compiled package, install, and run it.\n","107":"builds in a cleanroom\n","31":"a ch-root is ....\nthis means that we can use a chroot to --->\n","12":"it might sound surprising but scaling a compiled piece of software to support multiple platforms is rife with problems.\nthis talk will cover some of the bigger lessons i’ve learned when i built, tested, and shipped software that had to run and work on many different platforms.\n","50":"dynamic linking...\n","98":"you can use cygwin or mingw to help you build scripts to test your packages on windows because, luckily, windows installer packages, MSI files, are installable on the command line.\n","108":"linking!\n","32":"create a cleanroom environment since any program running in a ch-root cannot access files outside the ch-root.\nthe next couple tools do exactly this: they make creating and using ch-roots easy. 
\n","13":"i want to start with what i think is probably the most important lesson i learned. if you fall asleep during my talk and take just 1 thing away i hope it will be this.\n","51":"ok so a quick tangent.\nthis is ulrich drepper. you may have heard of him, he is a super smart, really intense guy who is a glibc maintainer and involved with lots of other important open source projects that make linux and the linux userland possible.\nhe has lots of strong opinions, including a very strong opinion about static linking... let’s look at a blog post he wrote...\n","61":"ok number 4!!!\n","109":"modularity.\n","14":"the number 1 most important thing is **REPEATABLE BUILDS**\nyou want to know, for sure, that no matter what happens you can always --->\n","52":"static linking considered harmful\nconclusion: never use static linking\nhere’s a link to the blog post.. if you google ulrich drepper static linking, you will find it.\n","33":"the first time you create a chroot using these tools, it can be a bit heavy.\nthe userland portion of the operating system has to be downloaded, compressed, and written to disk.\n","71":"and here we are.\n","81":"last but not least... number 5...\n","24":"number 2!\n","62":"modularity.\n","110":"expect failure\n","15":"time travel back to a specific build number and regenerate that build as quickly and easily as possible. 
hopefully just 1 button click or 1 or 2 commands on the command line.\nyou want to be able to regenerate that build and examine the source if necessary.\n","53":"it is not really my place to say that some one as smart and experienced as ulrich drepper is wrong or misguided -- i’m sure he has really great reasons for his points in his blog post.\nhe is at least one million times smarter than me, but in the specific context we are talking about: scaling compiled applications, i have to disagree with him.\noverall, he is probably right, but for the context of this talk, in my opinion, static linking is actually quite useful.\n","91":"number 6!!!\n","34":"once this initial process is complete, using chroots becomes very lightweight because the base chroot....\n","72":"no please don’t do this\n","25":"clean room!\nyour builds for different platforms must be done in clean environments.\n","63":"so this part will probably seem obvious to most people, and even seemed obvious to me too but yet i still made this mistake and was left trying to untangle ....\n","6":"when people talk about scaling, usually you imagine building or modifying a backend web service to deal with increased traffic or more complex computations.\n","82":"expect failure\n","111":"automated TESTING.\n","54":"why?\n","92":"automated TESTING.\n","35":"can be re-used.\nfor example, you create a chroot for redhat 6 that has the base operating system and it is compressed on disk. 
when you do your build, the tool will extract the base chroot from the archive, do the build, and then throw away the extracted base.\nthe initial image you created is still pristine and can be re-used.\n","73":"there’s a better way!!\n","16":"If a customer reports a bug or problem with a specific build of your software, you don’t have to dig through a mess to try to get the exact copy of the source and binary they are using.\n","26":"when i tell people this, i usually get a lot of push back, especially when asking in IRC chatrooms for different operating systems. people ask me this question as if i’m crazy or something, but really the answer is quite simple.\n","64":"some code that looked like this.\n","7":"typically these applications are rolled out in your datacenter or on a public or private cloud. \nthese types of applications are, hopefully, rolled out to homogenous computer systems. systems with the same kernel version, same library versions, the same (or very similar) hardware vendors, and so on. \n","83":"because compiled software deployed on to systems you can’t control **WILL** fail somehow.\nmaybe some one will use a network card you dont support\nan experimental kernel\nthey might have bad RAM\nwho knows\n","112":"and, of course, be careful because a lot of these tools are buggy or poorly documented. 
\n","36":"if you are creating binaries which will be packaged in an RPM for centos, redhat, fedora, or other RPM based distributions, you can use mock.\nmock allows you to create easily create and manage chroots for RPM based distributions.\n","17":"you want your builds neatly organized, easily accessible, and easily reproducible.\nthe best way to accomplish this is by automating this process with a build server.\n","55":"because each operating system and each version of linux really is a unique and special snowflake.\nif your program uses libpcap, the version of libpcap on redhat6 is not the same as the version on debian6, even though both are linux.\nin fact, the version of libpcap on debian6 and ubuntu 12.04 are different and they are both debian based linuxes!\n","93":"when you move from supporting just one or two platforms\n","65":"why?\nwell i think it’s easy to make the excuse when you have deadlines coming up and say “OK, I’ll just add support for redhat6 really quickly and clean this up later.”\nIT’S ONLY A FEW LINES OF CODE.\n","8":"my talk today is about a different type of scaling. \n","46":"number 3 !!!\n","84":"expect failure.\nyour program will crash either because of a bug in your code, an unexpected system configuration, or a bug in a library you are linking in.\n","27":"builds need to happen in a clean room-like environment to prevent artifacts from other builds erroneously being included. for example: lets say the first build of your application has a config file in a certain location. that file gets written to disk in your build process. you then decide to change your application so that the config file gets written to a new place but you forget to change some part of the code which checks the location of the file. 
the code will build successfully because your old file from previous build is still in the right place.\nthis is just one example, but of course there could be plenty of other examples: applications which use plugins, linking against libraries (which i will talk more about shortly), and packaging as well.\n","37":"if you are creating binaries which will be packaged in a DEB for debian, ubuntu, or other deb based distributions, you can use pbuilder.\npbuilder, like mock, allows you to easily create and manage chroots for debian based distributions.\n","18":"i used jenkins, which seemed to work OK for me. there are other build servers out there you can use which i’m sure are just as good or maybe even better! the only one that i personally have used is jenkins.\n","56":"remember this mess? you end up back here. if a customer reports a bug or a problem in your software, you have no idea if the problem is actually in the specific version of libpcap or your application.\nwhat happens if the user has decided to overwrite the libpcap, maybe accidentally, with some new experimental version?\nif you rely on dynamic linking you can run into a huge mess of trying to track down what versions of what libraries people are using and how they are being used.\n","94":"to supporting MANY platforms\n"}