The char-based diff looked good in some pages, but yielded terrible results in
others. The word-based algo is more consistent overall.
Change-Id: I7f2d40315ad96df037c2d9a1d50739e3d21b6c81
The word or char-based algorithm does not scale well beyond 5k chars or so. We
now perform a line-based diff and then continue to diff the line differences
using the char-based algorithm. This gives a char-based diff even for bigger
inputs.
Change-Id: Iec87ca56540060e4df2859ba54c992e7ff5cfe10
* Stay in round-trip mode in HTML DOM output
* Return DOM, wikitext and diff as soon as they are available
Change-Id: I7f8f44cfe8eed63a521d1318d116c22232cb6b1b
* After installing Parsoid (sudo npm install -g in modules/parser), run 'node
server.js' from the api directory and navigate to http://localhost:8000/ and
follow the directions. You can start to navigate the English wikipedia at
http://localhost:8000/Main_Page, or manually enter wikitext or HTML DOM to
convert.
* Uses the express framework, could also use just connect
* Uses the cluster module to manage workers per-core and restart those on
failure
Change-Id: I443f2996ed3df00826b038b7476a2f966ab0c425