Sinthetic Labs | Blog

Welcome to our static blog. We use a thing called HTML. Crazy, right?

A look inside Facebook's source code

4th October, 2014 — Nathan Malcolm

Note: None of the code in this post was obtained illegally, nor given to me directly by any Facebook employee at any time.

I've always been a fan of Facebook from a technical point of view. They contribute a lot to the open source community and often open source their internal software too. Phabricator, libphutil, and XHP are great examples of that. For a while I contributed a bit to both Phabricator and XHP, and I ended up finding out a lot more about Facebook's internals than I intended. Read on...

It was mid-2013 and I was busy fixing a few bugs I had encountered while using Phabricator. If my memory serves me correctly, the application was throwing a PhutilBootloaderException. I didn't have much knowledge of how Phabricator worked at the time, so I googled the error message. As you'd expect, I came across source code and references, but one specific link stood out. It was a Pastebin link.

Of course, this intrigued me. This is what I found...

[[email protected] ~/devtools/libphutil] arc diff --trace
>>> [0] <conduit> conduit.connect()
<<< [0] <conduit> 98,172 us
>>> [1] <exec> $ (cd '/home/emir/devtools/libphutil'; git rev-parse --show-cdup)
<<< [1] <exec> 13,629 us
>>> [2] <exec> $ (cd '/home/emir/devtools/libphutil/'; git rev-parse --verify HEAD^)
<<< [2] <exec> 17,024 us
>>> [3] <exec> $ (cd '/home/emir/devtools/libphutil/'; git diff --no-ext-diff --no-textconv --raw 'HEAD^' --)
>>> [4] <exec> $ (cd '/home/emir/devtools/libphutil/'; git diff --no-ext-diff --no-textconv --raw HEAD --)
>>> [5] <exec> $ (cd '/home/emir/devtools/libphutil/'; git ls-files --others --exclude-standard)
>>> [6] <exec> $ (cd '/home/emir/devtools/libphutil/'; git ls-files -m)
<<< [5] <exec> 73,004 us
<<< [6] <exec> 74,084 us
<<< [4] <exec> 77,907 us
<<< [3] <exec> 80,606 us
>>> [7] <exec> $ (cd '/home/emir/devtools/libphutil/'; git log --first-parent --format=medium 'HEAD^'..HEAD)
<<< [7] <exec> 16,390 us
>>> [8] <conduit> differential.parsecommitmessage()
<<< [8] <conduit> 106,631 us
>>> [9] <exec> $ (cd '/home/emir/devtools/libphutil'; git rev-parse --show-cdup)
<<< [9] <exec> 9,976 us
>>> [10] <exec> $ (cd '/home/emir/devtools/libphutil/'; git merge-base 'HEAD^' HEAD)
<<< [10] <exec> 13,472 us
>>> [11] <exec> $ (cd '/home/emir/devtools/libphutil/'; git diff --no-ext-diff --no-textconv --raw '00645a0aec09edc7f0f1f573032991ae94faa01b' --)
>>> [12] <exec> $ (cd '/home/emir/devtools/libphutil/'; git diff --no-ext-diff --no-textconv --raw HEAD --)
>>> [13] <exec> $ (cd '/home/emir/devtools/libphutil/'; git ls-files --others --exclude-standard)
>>> [14] <exec> $ (cd '/home/emir/devtools/libphutil/'; git ls-files -m)
<<< [11] <exec> 19,092 us
<<< [14] <exec> 15,219 us
<<< [12] <exec> 21,602 us
<<< [13] <exec> 43,139 us
>>> [15] <exec> $ (cd '/home/emir/devtools/libphutil/'; git diff --no-ext-diff --no-textconv -M -C --no-color --src-prefix=a/ --dst-prefix=b/ -U32767 '00645a0aec09edc7f0f1f573032991ae94faa01b' --)
<<< [15] <exec> 28,318 us
>>> [16] <exec> $ '/home/engshare/devtools/libphutil/src/parser/xhpast/bin/xhpast' --version
<<< [16] <exec> 11,420 us
>>> [17] <exec> $ '/home/engshare/devtools/arcanist/scripts/phutil_analyzer.php' '/home/emir/devtools/libphutil/src/markup/engine/remarkup/markuprule/hyperlink'
<<< [17] <exec> 490,196 us
>>> [18] <exec> $ '/home/engshare/devtools/arcanist/scripts/phutil_analyzer.php' '/home/engshare/devtools/libphutil/src/markup'
>>> [19] <exec> $ '/home/engshare/devtools/arcanist/scripts/phutil_analyzer.php' '/home/engshare/devtools/libphutil/src/markup/engine/remarkup/markuprule/base'
>>> [20] <exec> $ '/home/engshare/devtools/arcanist/scripts/phutil_analyzer.php' '/home/engshare/devtools/libphutil/src/parser/uri'
>>> [21] <exec> $ '/home/engshare/devtools/arcanist/scripts/phutil_analyzer.php' '/home/engshare/devtools/libphutil/src/utils'
<<< [18] <exec> 498,899 us
<<< [19] <exec> 497,710 us
<<< [20] <exec> 517,740 us
<<< [21] <exec> 556,267 us
>>> [22] <exec> $ '/home/engshare/devtools/libphutil/src/parser/xhpast/bin/xhpast'
<<< [22] <exec> 10,066 us
 LINT OKAY  No lint problems.
Running unit tests...
HipHop Fatal error: Uncaught exception exception 'PhutilBootloaderException' with message 'The phutil library '' has not been loaded!' in /home/engshare/devtools/libphutil/src/__phutil_library_init__.php:124\nStack trace:\n#0 /home/engshare/devtools/libphutil/src/__phutil_library_init__.php(177): PhutilBootloader->getLibraryRoot()\n#1 /home/engshare/devtools/arcanist/src/unit/engine/phutil/PhutilUnitTestEngine.php(53): PhutilBootloader->moduleExists()\n#2 /home/engshare/devtools/arcanist/src/workflow/unit/ArcanistUnitWorkflow.php(113): PhutilUnitTestEngine->run()\n#3 /home/engshare/devtools/arcanist/src/workflow/diff/ArcanistDiffWorkflow.php(1172): ArcanistUnitWorkflow->run()\n#4 /home/engshare/devtools/arcanist/src/workflow/diff/ArcanistDiffWorkflow.php(225): ArcanistDiffWorkflow->runUnit()\n#5 /home/engshare/devtools/arcanist/scripts/arcanist.php(257): ArcanistDiffWorkflow->run()\n#6 {main}

Okay — so this isn't exactly source code. It's just some command line output. But it does tell us some interesting information.

After this find, I went ahead and tried to find similar pastes which had been made. I was not disappointed.

[25/10/2013] Promoting The Meme Bank (1/1) - Campaign Update Failed: Campaign 6009258279237: Value cannot be null (Value given: null) TAAL[BLAME_files,www/flib/core/utils/enforce.php,www/flib/core/utils/EnforceBase.php]

Now, this looks to be an exception which was caught and logged. What's interesting here is that it shows us file names and paths. "flib" (Facebook Library) is an internal library which contains useful utilities and functions to help with development. Let's go deeper...

[ksalas@dev578 ~/www] ./scripts/intl/intl_string.php scan .
Loading modules, hang on...
Analyzing directory `.'
Error: Command `ulimit -s 65536 && /mnt/vol/engshare/tools/fbt_extractor -tasks 32 '/data/users/ksalas/www-hg'' failed with error #2:

warning: parsing problem in /data/users/ksalas/www-hg/flib/intern/third-party/phpunit/phpunit/Tests/TextUI/dataprovider-log-xml-isolation.phpt
warning: parsing problem in /data/users/ksalas/www-hg/flib/intern/third-party/phpunit/phpunit/Tests/TextUI/dataprovider-log-xml.phpt
warning: parsing problem in /data/users/ksalas/www-hg/flib/intern/third-party/phpunit/phpunit/Tests/TextUI/log-xml.phpt
warning: parsing problem in /data/users/ksalas/www-hg/scripts/sandcastle/local_testing/script_for_test_commits.php
warning: parsing problem in /data/users/ksalas/www-hg/lib/arcanist/lint/linter/__tests__/hphpast/php-tags-script.lint-test
LEXER: unrecognised symbol, in token rule:'
warning: parsing problem in /data/users/ksalas/www-hg/scripts/intern/test/test.php
warning: parsing problem in /data/users/ksalas/www-hg/scripts/intern/test/test2.php
Fatal error: exception Common.Todo
Fatal error: exception Sys_error("Broken pipe")

Type intl_string.php --help to get more information about how to use this script.

Now we're getting to the good stuff. We have ksalas on dev578 running what seems to be a string parser. `intl_string.php` tries to run `/mnt/vol/engshare/tools/fbt_extractor`, so we know for sure there are some other scripts in `/mnt/vol/engshare/`. We can also see they use PHPUnit for unit testing, and "www-hg" shouts Mercurial to me. It's well known they moved from Subversion to Git -- I'd put money on it that they've been experimenting with Mercurial too at some point.
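If you want to do this kind of digging yourself, pulling the paths out of a paste is easy to automate. Below is a rough Python sketch of the idea, not a tool I actually used: the sample text is two lines from the output above, the function names are made up for illustration, and the guess that personal checkouts live under `/home/<user>` or `/data/users/<user>` is just an assumption based on the pastes in this post.

```python
import re

# Sample input: two lines copied from the output quoted above.
PASTE = """\
Error: Command `ulimit -s 65536 && /mnt/vol/engshare/tools/fbt_extractor -tasks 32 '/data/users/ksalas/www-hg'' failed with error #2:
warning: parsing problem in /data/users/ksalas/www-hg/scripts/intern/test/test.php
"""

# An absolute path is one or more "/segment" pieces. The lookbehind skips
# slashes inside relative paths such as "www/flib/core".
PATH_RE = re.compile(r"(?<![\w.-])(?:/[\w.+-]+)+")

# Assumption: user directories sit under /home/<user> or /data/users/<user>.
USER_RE = re.compile(r"^/(?:home|data/users)/([^/]+)")


def extract_paths(text):
    """Return the distinct absolute paths found in text, in first-seen order."""
    seen = []
    for match in PATH_RE.finditer(text):
        path = match.group(0)
        if path not in seen:
            seen.append(path)
    return seen


def extract_users(paths):
    """Return the sorted user names implied by the given paths."""
    users = set()
    for path in paths:
        match = USER_RE.match(path)
        if match:
            users.add(match.group(1))
    return sorted(users)


paths = extract_paths(PASTE)
users = extract_users(paths)
```

Run over a pile of pastes, something like this gives you a quick map of directory layouts and user names. The regex is deliberately simple and will miss paths containing spaces or other unusual characters.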

"That's still not god damn source code!" I hear you cry. Don't worry, someone posted some on Pastebin too.

Index: flib/core/db/queryf.php
--- flib/core/db/queryf.php
+++ flib/core/db/queryf.php
@@ -1104,11 +1104,12 @@
  *  @author rmcelroy
 function mysql_query_all($sql, $ok_sql, $conn, $params) {
+  FBTraceDB::rqsend($ok_sql);
   switch (SQLQueryType::parse($sql)) {
     case SQLQueryType::READ:
       $t_start = microtime(true);
       $result = mysql_query_read($ok_sql, $conn);
       $t_end = microtime(true);
       $t_delta = $t_end - $t_start;
       if ($t_delta > ProfilingThresholds::$queryReadDuration) {


The file in question is `flib/core/db/queryf.php`. At first glance we can tell it's a diff of a file which contains a bunch of MySQL-related functions. The function we can see here, `mysql_query_all()`, was written by rmcelroy. From what I can see in the code it's pretty much a simple function which executes a query, with a little custom logging code. It may be more complex but unfortunately we may never know. :(
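To make the pattern a bit more concrete, here's a minimal Python sketch of "time the query, then do something if it was slow". This is my own illustration and not Facebook's code: the paste cuts off right after the threshold check, so what happens inside that `if` is unknown, and the logging, the `timed_read` name and the threshold value below are all assumptions.

```python
import time

# Hypothetical threshold in seconds. The real value of
# ProfilingThresholds::$queryReadDuration is not shown in the paste.
SLOW_READ_SECONDS = 0.5


def timed_read(run_query, sql, threshold=SLOW_READ_SECONDS, log=print,
               clock=time.perf_counter):
    """Run a read query and report it via log if it takes longer than threshold.

    run_query is any callable that accepts the SQL string and returns rows.
    """
    t_start = clock()
    result = run_query(sql)
    t_delta = clock() - t_start
    if t_delta > threshold:
        # Assumption: slow queries are logged; the original body is not visible.
        log(f"slow read ({t_delta:.3f}s): {sql}")
    return result


# Usage with a stand-in query function instead of a real database.
rows = timed_read(lambda sql: [("ok",)], "SELECT 1")
```

Passing the clock and the logger in as parameters keeps the sketch easy to test without a database or real delays.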

I'll post a few more examples of code I've found, all of which (and more) can be downloaded from the bottom of this post.

diff --git a/flib/entity/user/personal/EntPersonalUser.php b/flib/entity/user/personal/EntPersonalUser.php
index 4de7ad8..439c162 100644
--- a/flib/entity/user/personal/EntPersonalUser.php
+++ b/flib/entity/user/personal/EntPersonalUser.php
@@ -306,13 +306,15 @@ class EntPersonalUser extends EntProfile

   public function prepareFriendIDs() {
-    // TODO: add privacy checks!
     return null;

   public function getFriendIDs() {
-    return DT('ReciprocalFriends')->get($this->id);
+    if ($this->canSeeFriends()) {
+      return DT('ReciprocalFriends')->get($this->id);
+    }
+    return array();

@@ -397,6 +399,7 @@ class EntPersonalUser extends EntProfile
+        PrivacyConcepts::FRIENDS,
         // Note that we're fetching GENDER here because it's PAI
         // so it's cheap and because we don't want to add a prepareGender
         // call here if we don't have to.
@@ -418,6 +421,10 @@ class EntPersonalUser extends EntProfile
     return must_prepare($this->viewerCanSee)->canSee();

+  protected function canSeeFriends() {
+    return must_prepare($this->viewerCanSee)->canSeeFriends();
+  }

# update your local master branch
  git checkout master
  git pull --rebase

# never do any work on master branch
# create & switch to new branch instead
  git checkout -b my_branch

# rebase 'my_branch' onto master
  git checkout my_branch
  git rebase master

# list branches
  git branch

# delete 'my_branch' branch
  $ git branch -d my_branch

# shows status
$ git status

stage file, also remove conflict
  $ git add <file>

revert file to head revision
  $ git checkout -- <file>

commit change
  $ git commit -a --amend
    -a       stages all modified files
    --amend  overwrites last commit

show all local history (amend commits, branch changes, etc.)
  $ git reflog

show history (there is lot of options)
  $ git log
  $ git log --pretty=oneline --abbrev-commit --author=plamenko
  $ git log -S"text to search"

show last commit (what is about to be send for diff)
  $ git show

get the version of the file from the given commit
  $ git checkout <commit> path/to/file

fetch & merge
  $ git pull --rebase

resolving conflicts:
  use ours:
    $ git checkout --ours index.html
  use theirs:
    $ git checkout --theirs index.html

commit author:
  $ git config --global user.name "Ognjen Dragoljevic"
  $ git config --global user.email [email protected]

  After doing this, you may fix the identity used for this commit with:
  $ git commit --amend --reset-author

commit template:

rename a branch:
  $ git branch -m old_branch new_branch

interactive rebase
  $ git rebase -i master
    make changes
    $ git commit -a --amend
    $ git rebase --continue
    $ arc diff
    $ arc amend
    $ git push --dry-run origin HEAD:master // remove dry-run to do actual push

to update commit message in phabricator
  $ arc diff --verbatim

# Creates a new www sandbox managed by git.
# Usage: git-clone-www [dirname]
# dirname defaults to "www-git".



# Are we running on a machine that has a local shared copy of the git repo?
if [ -d /data/git/tfb ]; then
  # Yes. Reuse its objects directory.
  echo "Cloning the local host's shared www repository..."
else
  # Nope, copy the NFS server's objects locally so as not to be dog slow.
  echo "Copying from the shared www repository on the NFS server..."
fi

if [ ! -d $HOME/local ]; then
  echo "You don't seem to have a 'local' symlink in your home directory."
  echo "Fix that and try again."
  exit 1
fi

cd $HOME/local
if [ -d "$DIRNAME" ]; then
  echo "You already have a $DIRNAME directory; won't overwrite it."
  echo "Aborting."
  exit 1
fi

# We clone the shared repository here rather than running "git svn clone"
# because it's much, much more efficient. And the clone has some options:
# -n = Don't check out working copy yet.
# -s = Reference the origin's .git/objects directory rather than copying.
#      Saves gobs of disk space and makes the clone nearly instantaneous.
#      We don't do this if there's no local-disk shared repo.

git clone $SHARE -n "$PARENT" "$DIRNAME"


# If we're sharing a local repository's objects, use the NFS server as a
# fallback so stuff doesn't break if we use this repo from another host
# that doesn't have a /data/git/tfb directory.
if [ -s $ALTERNATES ]; then
  echo $NFS_REPO/.git/objects >> $ALTERNATES
fi

# We want to use the same remote branch name ("remotes/trunk") for git-svn
# and for fetches from the shared git repo, so set that up explicitly.
git config remote.origin.url "file://$PARENT/.git"
git config remote.origin.fetch refs/remotes/trunk:refs/remotes/trunk
git config --remove-section branch.master

# Enable the standard commit template
git config commit.template /home/engshare/admin/scripts/templates/git-commit-template.txt

# Enable recording of rebase conflict resolutions
git config rerere.enabled true

# Now fetch from the shared repo. This mostly just creates the new "trunk"
# branch since we already have the objects thanks to the initial "git clone".
git fetch origin

# Blow away the "origin/" branches created by "git clone" -- we don't need them.
rm -rf .git/refs/remotes/origin

# Now it's time to turn this plain old git repo into a git-svn repo. Really
# all we need is the svn-remote configuration (installed above) and a
# metadata file with some version information. git-svn is smart enough to
# rebuild the other stuff it needs.

echo ""
echo "Synchronizing with svn..."

git svn init -itrunk svn+ssh://tubbs/svnroot/tfb/trunk/www

# Now tweak the git-svn config a little bit so it's easier for someone to
# go add more "fetch" lines if they want to track svn-side branches in
# addition to trunk. This doesn't affect any of the existing history.
git config svn-remote.svn.url svn+ssh://tubbs/svnroot
git config svn-remote.svn.fetch tfb/trunk/www:refs/remotes/trunk

# Let git-svn update its mappings and fetch the latest revisions. This can
# spew lots of uninteresting output so suppress it.
git svn fetch > /dev/null

echo ""
echo "Checking out working copy..."

# We use git reset here because the git svn fetch might have advanced trunk
# to a newer revision than the master branch created by git clone.
git reset --hard trunk

if [ ! -d "$HOME/$DIRNAME" ]; then
  echo ""
  echo "Making home dir symlink: $HOME/$DIRNAME"
  ln -s "local/$DIRNAME" "$HOME/$DIRNAME"
else
  echo ""
  echo "$HOME/$DIRNAME already exists; leaving it alone."
fi

echo ""
echo "All done. To make this your new main sandbox directory, run"
echo ""
echo "    rm -rf ~/www"
echo "    ln -s ~/$DIRNAME ~/www"
echo ""


Lastly, I wanted to share something which I found quite amusing: Facebook's MySQL password. This came from what seems to be a `print_r()` of an array which made its way into production a few years ago.

array ( 'ip' => '', 'db_name' => 'insights', 'user' => 'mark', 'pass' => 'e5p0nd4', 'mode' => 'r', 'port' => 3306, 'cleanup' => false, 'num_retries' => 3, 'log_after_num_retries' => 4, 'reason' => 'insights', 'cdb' => true, 'flags' => 0, 'is_shadow' => false, 'backoff_retry' => false, )

Host: (Private IP)
Database Name: insights
User: mark
Password: e5p0nd4

Okay, so it's not the most secure password. But Facebook's database servers are heavily firewalled. Though if you do manage to break into Facebook's servers, there's the password.

Edit: Mark Zuckerberg was an officer at the Jewish fraternity Alpha Epsilon Pi. The motto on their coat of arms is "ESPONDA". :-)

So what have we learnt today? I think the main thing to take away from this is that you shouldn't use public services such as Pastebin to post internal source code. Some creepy guy like me is going to collect it all and write about it. Another thing is to make sure debug information is never pushed to production. I didn't put much effort into this, but there will be more of Facebook's source code floating around out there.
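On the debug information point: if you really must dump a config array somewhere, mask the secrets first. Here's a minimal Python sketch of that idea. The `redact` helper and the list of "sensitive" key names are my own assumptions for illustration, and the sample config uses a dummy password rather than a real one.

```python
# Assumption: these key names mark secrets; extend the set for a real config.
SENSITIVE_KEYS = {"pass", "password", "secret", "token"}


def redact(value, sensitive=SENSITIVE_KEYS, mask="[redacted]"):
    """Return a copy of value with sensitive dictionary entries masked.

    Nested dictionaries, lists and tuples are walked recursively; the
    original object is left untouched.
    """
    if isinstance(value, dict):
        return {
            key: (mask if str(key).lower() in sensitive
                  else redact(item, sensitive, mask))
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return type(value)(redact(item, sensitive, mask) for item in value)
    return value


# Dummy connection settings shaped like the array above.
config = {"db_name": "insights", "user": "mark", "pass": "example", "port": 3306}
safe = redact(config)
```

Printing `safe` instead of `config` means a stray debug statement leaks the shape of the configuration but not the password. It's a safety net, not a substitute for keeping debug output out of production in the first place.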

Again I'd like to stress that everything I have posted here was already available on the Internet. All I needed to do was search for it. And here's the download:


If you enjoyed this post and want to see more, follow @NathOnSecurity on Twitter.

This is a static site. There is no server side logic. Please point your scanners elsewhere.

Donations towards research are greatly appreciated:
BTC: 14W72Ktt7zgfoRhg48ftqdUw938y5vBLWr