This repository was archived by the owner on Nov 13, 2021. It is now read-only.
Add period values for 'sec' and 'ms' granularities, as it appears they are missing.

If granularity detection yields "sec" or "ms", detect_anoms blows up
with a message stating that 'period' was not provided. That message is
accurate: the switch on 'gran' has no entry for either granularity.
This patch provides at least some defaults.

This is a bit tricky, since the number of samples really isn't guaranteed
to be 1 measurement per whatever the 'gran' variable says it is.
However, this code appears to assume that, which leads to "fun" when the
precision is in ms but the measurements fire a lot less often, say once
per five minutes (that's the test case in question).

There are two ways to deal with this, neither of them implemented by this patch:
- Look at the delta between the first and the last timestamp and the number
of records and make an educated guess from there.
- Add 'period' as a function argument and let the caller decide, using the
hardcoded values simply as educated guesses.

These approaches can be combined, but that is left as an exercise for the reader.
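
A minimal sketch of how the two approaches might be combined, for illustration only. `estimate_period` and its `season_secs` argument are hypothetical names, not part of this patch or of the package, and the sketch assumes a daily season and POSIXct timestamps. It derives the period from the observed sampling interval instead of from the timestamp precision, and lets the caller override the guess.

```r
# Hypothetical helper, not part of the AnomalyDetection package.
# Estimates the number of observations per seasonal cycle from the actual
# sampling interval rather than from the timestamp precision.
estimate_period <- function(timestamps, period = NULL,
                            season_secs = 24 * 60 * 60) {
  # Second approach: an explicit 'period' from the caller always wins.
  if (!is.null(period)) {
    return(period)
  }
  if (length(timestamps) < 2) {
    stop("need at least two timestamps to estimate a period")
  }

  # First approach: average spacing between records, computed as
  # (last timestamp - first timestamp) / (number of records - 1).
  span_secs <- as.numeric(difftime(max(timestamps), min(timestamps),
                                   units = "secs"))
  if (span_secs <= 0) {
    stop("timestamps must span a positive amount of time")
  }
  step_secs <- span_secs / (length(timestamps) - 1)

  # Observations per assumed season (one day unless season_secs is changed).
  max(2, round(season_secs / step_secs))
}

# One reading every five minutes for two days, as in the test case above.
ts <- seq(as.POSIXct("2016-04-18 00:00:00", tz = "UTC"),
          by = "5 min", length.out = 576)

estimate_period(ts)               # 288 observations per day
estimate_period(ts, period = 12)  # caller override, returns 12
```

With this shape, the hardcoded values in the switch would only be needed as a fallback when the timestamps give no usable spacing.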
randakar committed Apr 18, 2016
commit 82a762aa5a79e1188d90110d075860838e272fb7
8 changes: 6 additions & 2 deletions R/ts_anom_detection.R
@@ -166,11 +166,15 @@ AnomalyDetectionTs <- function(x, max_anoms = 0.10, direction = 'pos',
 x <- format_timestamp(aggregate(x[2], format(x[1], "%Y-%m-%d %H:%M:00"), eval(parse(text="sum"))))
 }

+## This is a bit tricky, since the number of samples really isn't guaranteed to be 1 measurement per whatever the 'gran' variable says it is.
+## Either we'll need to do something smarter (look at the delta between the first and the last timestamp and count the number of rows) or
+## alternatively, simply make 'period' a function argument and use these values as defaults.
 period = switch(gran,
 min = 1440,
+ms = 1000,
+sec = 60*60,
 hr = 24,
-# if the data is daily, then we need to bump the period to weekly to get multiple examples
-day = 7)
+day = 7) # if the data is daily, then we need to bump the period to weekly to get multiple examples
 num_obs <- length(x[[2]])

 if(max_anoms < 1/num_obs){