Tech:Incidents/2014-06-ExtensionIssues

This incident affected multiple extensions (listed below) for a period of ~12 days.

Timeline

 * 7 June 2014 - JL - Upgraded to MediaWiki 1.23.
 * 12 June 2014 - JL - Turned off the extensions that had been causing issues on the farm.
 * 14 June 2014 - Addshore - Took a look at the issue but couldn't nail it down in the time available.
 * 19 June 2014 - Addshore - Echo rolled back and extensions re-enabled. Thanks to ebernhardson in #wikimedia-corefeatures!

Description
This incident boils down to a change merged into the Echo extension that broke multiple other extensions. The Echo branch for MW 1.23 included this patch and thus broke those extensions. Rolling back to a previous version of Echo fixed the problem for now, although this does mean we are frozen at that version until the issue is resolved upstream.

The exceptions were all essentially the same, differing only in the magic word reported. As the stack trace below shows, Echo appears in the call chain (EchoHooks::initEchoExtension).

Stacktrace
2014-06-18 23:27:13 prod4 extloadwiki: [7ad42023] [no req]  Exception from line 318 of /w/includes/MagicWord.php: Error: invalid magic word 'useliquidthreads'
 * 1) 0 /w/includes/MagicWord.php(241): MagicWord->load('useliquidthread...')
 * 2) 1 /w/includes/parser/Parser.php(4984): MagicWord::get('useliquidthread...')
 * 3) 2 /w/extensions/LiquidThreads/classes/Hooks.php(860): Parser->setFunctionHook('useliquidthread...', Array)
 * 4) 3 [internal function]: LqtHooks::onParserFirstCallInit(Object(Parser))
 * 5) 4 /w/includes/Hooks.php(206): call_user_func_array('LqtHooks::onPar...', Array)
 * 6) 5 /w/includes/GlobalFunctions.php(4004): Hooks::run('ParserFirstCall...', Array, NULL)
 * 7) 6 /w/includes/parser/Parser.php(275): wfRunHooks('ParserFirstCall...', Array)
 * 8) 7 [internal function]: Parser->firstCallInit
 * 9) 8 /w/includes/StubObject.php(99): call_user_func_array(Array, Array)
 * 10) 9 /w/includes/StubObject.php(119): StubObject->_call('firstCallInit', Array)
 * 11) 10 /w/includes/cache/MessageCache.php(1023): StubObject->__call('firstCallInit', Array)
 * 12) 11 /w/includes/cache/MessageCache.php(1023): StubObject->firstCallInit
 * 13) 12 /w/includes/cache/MessageCache.php(1000): MessageCache->getParser
 * 14) 13 /w/includes/Message.php(977): MessageCache->transform('', false, Object(Language), NULL)
 * 15) 14 /w/includes/Message.php(669): Message->transformText('')
 * 16) 15 /w/includes/Message.php(732): Message->toString
 * 17) 16 /w/extensions/Echo/Hooks.php(35): Message->text
 * 18) 17 [internal function]: EchoHooks::initEchoExtension
 * 19) 18 /w/includes/Setup.php(601): call_user_func('EchoHooks::init...')
 * 20) 19 /w/maintenance/doMaintenance.php(100): require_once('/usr/share/ngin...')
 * 21) 20 /w/maintenance/rebuildLocalisationCache.php(179): require_once('/usr/share/ngin...')
 * 22) 21 {main}
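
Because the exceptions differed only in the magic word, a small script can group them by wiki and magic word to show which extensions are affected. The following is a minimal sketch, not something that was run during the incident. The regular expression is modelled only on the single log line quoted above, and the function name count_invalid_magic_words is hypothetical.

```python
import re
from collections import Counter

# Assumption: every relevant log line looks like the one quoted above:
#   <date> <time> <host> <wiki>: ... Error: invalid magic word '<word>'
# Lines in any other format are simply ignored.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<wiki>\S+): .*?"
    r"Error: invalid magic word '(?P<word>[^']+)'"
)


def count_invalid_magic_words(lines):
    """Count 'invalid magic word' exceptions per (wiki, magic word) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("wiki"), match.group("word"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_magic_words.py < exception.log
    for (wiki, word), n in count_invalid_magic_words(sys.stdin).most_common():
        print(f"{n}\t{wiki}\t{word}")
```

Each magic word in the output should map to one of the affected extensions, for example 'useliquidthreads' to LiquidThreads.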

Affected Extensions

 * DPLForum
 * LiquidThreads
 * CSS
 * AJAXPoll
 * VoteNY
 * SubpageFun
 * RegexFun
 * Arrays
 * Comments
 * Disambiguator
 * HeaderTabs

In Hindsight

 * It would have been nice to be able to run the update to 1.23, immediately spot this issue and roll back. Issues are currently hard to spot due to the number of sites and the unindexed logs. Logstash would probably have helped here.
 * Upgrades should be scheduled, and ALL possible staff should be around to help fix things. Upgrades should not be done unless a plan is ready to roll everything back.

Actions taken

 * Bug filed to Wikimedia Bugzilla for Echo
 * Fixed at https://gerrit.wikimedia.org/r/#/c/152968/ and https://gerrit.wikimedia.org/r/#/c/152971/