- Both methods split data to establish separate baselines.
- Can be used separately or applied together in one detector (e.g. count by error_type partition_field=host)
If you want to “hard split” the analysis, select a “partition_field_name”
- In general, the field chosen should have < 10,000 distinct values per job, because each partition requires additional memory
- Each instance of the field is like an independent variable
- Scoring of anomalies is more independent
If you want a “soft split”, select a “by_field_name”
- In general, the field chosen should have < 100,000 distinct values per job
- More appropriate for attributes of an entity (dependent variables)
- Scoring considers the history of the other by-field values
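
As an illustration of combining both splits in one detector, here is a minimal Python sketch that builds a job body using the `by_field_name` and `partition_field_name` keys. The helper names (`build_detector`, `split_warnings`), the `15m` bucket span, the `@timestamp` time field, and the cardinality figures are hypothetical choices for this example. The two limits are the rules of thumb listed above, not hard limits enforced by the product.

```python
import json

# Rule-of-thumb cardinality guidance from the notes above (not enforced limits).
PARTITION_FIELD_MAX = 10_000
BY_FIELD_MAX = 100_000


def build_detector(function, by_field=None, partition_field=None):
    """Build a detector dict; either split field may be omitted."""
    detector = {"function": function}
    if by_field:
        detector["by_field_name"] = by_field            # "soft split"
    if partition_field:
        detector["partition_field_name"] = partition_field  # "hard split"
    return detector


def split_warnings(detector, cardinalities):
    """Return warnings for split fields at or above the rule-of-thumb limits.

    `cardinalities` maps field name -> distinct value count (supplied by the
    caller; this sketch does not query any cluster).
    """
    limits = {
        "partition_field_name": PARTITION_FIELD_MAX,
        "by_field_name": BY_FIELD_MAX,
    }
    warnings = []
    for key, limit in limits.items():
        field = detector.get(key)
        if field is not None and cardinalities.get(field, 0) >= limit:
            warnings.append(
                f"{key}={field} has {cardinalities[field]} distinct values "
                f"(guideline: < {limit})"
            )
    return warnings


# Equivalent of "count by error_type partition_field=host".
detector = build_detector("count", by_field="error_type", partition_field="host")

job_body = {
    "analysis_config": {
        "bucket_span": "15m",           # assumed value for illustration
        "detectors": [detector],
    },
    "data_description": {"time_field": "@timestamp"},  # assumed field name
}

# Hypothetical cardinalities: 40 error types, 25,000 hosts.
warnings = split_warnings(detector, {"error_type": 40, "host": 25_000})

print(json.dumps(job_body, indent=2))
for w in warnings:
    print("WARNING:", w)
```

With these made-up numbers, the sketch flags `host` as a risky partition field while accepting `error_type` as a by field. In that situation, one option is to use the high-cardinality field as the by field instead, if a soft split suits the data.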