Commit graph

8 commits

Author SHA1 Message Date
Daimona Eaytoy f8c525fc52 Minor updates related to var dumps
- Include an attempt to restore the dump in case the text table
   contains a truncated dump (not 100% sure that this can really
   happen, nor do I know the cause, but it shouldn't hurt)
 - Remove a check for 'action'. The variable might be missing in case of
   a corrupted dump. Having an array at that point can only mean "new
   format".
 - Don't assume that old_wikitext and new_wikitext are set when showing
   past filter hits (again, might be unset due to data corruption).

Bug: T264513
Change-Id: I7510d28fc3f43f985a1283e23b413f07adfe7921
2020-10-04 01:18:14 +02:00
Daimona Eaytoy 55ba083b13 Introduce a KeywordsManager service
This will decouple a bit the huge and chaotic tangle of AF classes. Some
boilerplate code for AbuseFilter services is also added with this patch.

Note that this requires injecting a KeywordsManager in
AbuseFilterVariableHolder, or unit tests would fail. This is still
incomplete, and the Manager is only injected in tests, because
VariableHolder still has to be refactored.

The test for the UpdateVarDumps script had to be updated, because
serializing VHs in there was a bad choice. As pointed out in a comment,
the test is likely going to break again once we remove the BC code, but
I hope that we'll be able to remove the test at that point.

Change-Id: I12a656a310adb8c5f75cab63f6db9e121e109717
2020-09-28 23:03:52 +00:00
Daimona Eaytoy d3b21901a2 updateVarDumps: Add more options, aesthetic changes
This fixes a few minor issues noted while running in prod. Notably:

 - Don't print "Printing orphaned records" in dry-run
 - Print progress markers every 10 batches, not every batch
 - Change the option for printing progress markers to take a file, and
 recommend against stdout for big databases.
 - Add an option to sleep between batches.

Bug: T252696
Bug: T246539
Change-Id: I970e7649472625ade003c259f98b611d9d3d69d2
2020-05-18 15:42:00 +02:00
Daimona Eaytoy 120d1500a4 updateVarDumps: wait for replication after each batch
Otherwise the script could destroy the wikis...

Change-Id: If5f86952cb4927d612d6e2243df0823025ec5bd5
2020-05-07 17:51:36 +02:00
Daimona Eaytoy 8d25609290 updateVarDumps: move MediaWikiServices away from constructor
Services are not available when the constructor is called. Since it was
used in a single method, replace the property with a local variable.
Pass it as a param, though, so that we have to instantiate the service
only once.

Bug: T213006
Change-Id: I4df27f3d02432c201c04d9fa118f0129b0a79778
2020-04-30 18:01:36 +00:00
Daimona Eaytoy 11f5790eb7 updateVarDumps: Print orphaned ES records, don't try updating ES records
Following discussion at T246938, just append data in the ES. Add a flag
to print what records will be orphaned (for manual cleanup, if needed).

Bug: T213006
Bug: T246539
Change-Id: I39bca2f07905cbf89e1906d60a568252b4729c98
2020-03-12 15:09:59 +01:00
Daimona Eaytoy cd1a8efb90 Minor fixes for the updateVarDumps script
- Increase batch size to 500
- Add an option to print progress markers
- Fix some bad logic which caused some JSONified data to be stored in
the text table without checking (and respecting) old_flags. This caused
some errors on the beta cluster.

Additionally, add a return typehint to AbuseFilter::loadVarDump to make
sure that errors are caught asap. Not only there's no apparent way that
loadVarDump can return an array, but most code is already using the
result as a VariableHolder, unconditionally. This is probably another
leftover from the past.

Bug: T213006
Bug: T246539
Change-Id: Iaebd28badb70d27693fa809cad4db956881e3e5e
2020-03-03 18:31:52 +00:00
Daimona Eaytoy 2c03c77d9f Add a maintenance script to clean afl_var_dump
This script aims to fix every problem reported in T213006. Subsequent
patches will add new code and drop the back-compat one.

Bug: T213006
Bug: T187153
Bug: T204236
Bug: T187731
Bug: T204235
Bug: T214193
Bug: T214196
Bug: T34478
Depends-On: I5b29ff556eca45fe59d15e2e3df4d06f1f6b3934
Change-Id: I22cf698c5be77506727cbd227c67e037a5d89b5c
2020-02-28 19:41:30 +00:00