This document details how Huntress Managed SIEM bills for and retains data, and makes that data available to customers.
Definitions
- Data Pool: The total data volume available for ingestion through the remainder of the subscription period. The total data volume is based on the number of log sources purchased and the ingestion allocation per log source. Usage is determined based on the total ingestion versus the total data volume. This allows for each individual data source to use a dynamic amount of data, so long as the aggregate data volume does not exceed the Data Pool. This is intended to provide predictability and flexibility in cases where a few data sources within an account occasionally surpass their ingestion allocation.
- Data Source: Any system or service that generates data captured by the SIEM. This can be anything from an endpoint to a hardware device to a third-party SaaS application like Okta or Duo. It’s possible for a single machine to have multiple sources associated with it. For example, a Windows endpoint can be collecting event logs as well as antivirus logs or web server logs.
- Filtered Data: Data that has been determined to provide no security-relevant value and is not retained in any Huntress systems. This data will be dropped as close to the collection source as possible.
- Monthly Data Volume: The total volume of data an Account can ingest each month and still remain under the committed baseline for their subscription. Going over in one month does not necessarily incur additional fees; this depends on how much volume remains in the Data Pool.
- Parsed Data: Ingested data matching a known format where we have written code to extract fields from the data to allow us to better understand the reason for the event.
- Predictable Billing: A method of managing the annual data storage for an Account that favors truncating the oldest data over incurring additional costs beyond the remaining Data Pool.
- Raw Data: Ingested data that either does not match a known format or where we have not yet written parsing code to extract fields from the data.
- Smart Filtering: A Huntress-specific technology intended to reduce noise by extracting the useful security-relevant signal from log sources and discarding the rest.
Data Source Overview
When a partner subscribes to Managed SIEM, they will choose a tier that defines the minimum number of data sources they will commit to on a monthly basis. As a partner adds new data sources, either through adding endpoints that capture Windows event logs or connections that capture logs from third-party SaaS systems, they will increase the number of data sources. Once the partner has more data sources than their minimum commitment, they will be charged for the additional data sources.
The additional data sources will be calculated and charged in the same way that we calculate and charge for endpoints. We will bill based on a snapshot of existing data sources at the end of their monthly subscription period. If they have less than or equal to the minimum committed number of data sources, then they will be billed for the minimum. If they have more than the minimum commitment, they will be billed for the total number of data sources at the price per data source based on their subscription tier.
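The snapshot rule above can be sketched as a small calculation. This is only an illustration of the rule as described: the function names and the price value used in the usage note are hypothetical, and actual prices depend on the subscription tier.

```python
def billable_data_sources(snapshot_count: int, minimum_commitment: int) -> int:
    """Number of data sources billed for one monthly subscription period.

    At or below the minimum commitment, the minimum is billed;
    above it, the full snapshot count is billed.
    """
    return max(snapshot_count, minimum_commitment)


def monthly_data_source_charge(
    snapshot_count: int,
    minimum_commitment: int,
    price_per_source: float,  # hypothetical value; the real price is set by the tier
) -> float:
    """Charge for the period: billed sources times the per-source price."""
    return billable_data_sources(snapshot_count, minimum_commitment) * price_per_source
```

For instance, with a minimum commitment of 100, a snapshot of 80 data sources is billed as 100, while a snapshot of 120 is billed as 120.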
Data Volume Overview
In addition to the number of data sources, we will track the available data volume by allocation type, including baseline committed data, rollover data, one-time purchased data, and credited data. Tracking this data will allow us to enforce ingestion maximums as well as smooth potential overages when a data source ingests more than the maximum threshold. When a subscription is started, we will add Data Volume to the Data Pool based on the annual committed data volume (monthly committed data sources times 10 GB times 12 months). When an additional data source (charged data source over the minimum commitment) is charged, we will add any remaining data volume to the Data Pool.
The Data Pool will be consumed each month based on how much data has been ingested. This will allow partners to account for inconsistencies in ingestion and will allow them to use the data volume they purchase when they are charged for additional data sources. Data Volume will be consumed from the Data Pool in an order based on the earliest expiration date of the Data Allocation.
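The earliest-expiration ordering might be modeled as follows. This is a minimal sketch, not a description of the actual implementation: the `Allocation` class, its field names, and the `consume` function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Allocation:
    """One Data Allocation in the Data Pool (hypothetical structure)."""
    label: str          # e.g. "committed", "rollover", "purchased", "credited"
    remaining_gb: float
    expires: date


def consume(allocations: list[Allocation], ingested_gb: float) -> float:
    """Draw ingested_gb from the pool, earliest expiration first.

    Mutates the allocations in place and returns the volume (in GB)
    that the pool could not cover.
    """
    outstanding = ingested_gb
    for alloc in sorted(allocations, key=lambda a: a.expires):
        if outstanding <= 0:
            break
        used = min(alloc.remaining_gb, outstanding)
        alloc.remaining_gb -= used
        outstanding -= used
    return outstanding
```

Drawing from the allocation that expires soonest means volume that is about to lapse is used before volume that remains valid for longer.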
For example, if a partner commits to 100 data sources per month, the committed data volume for each month is 1,000 GB. If in the first month they actually have 120 data sources and ingest 900 GB of data, we would add 200 GB for the 20 additional data sources, bringing that month's volume to 1,200 GB, and then subtract the 900 GB of usage. This would leave the partner with 300 GB of data that could be consumed in a future month.
If instead the ingested data volume in the same scenario were 1,100 GB, we would add 200 GB to the Data Pool before subtracting 1,100 GB (the 1,000 GB monthly minimum plus 100 GB of additional ingest), leaving the partner with 100 GB of data that could be consumed in a future month.
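The arithmetic in the two examples above can be reproduced with a short sketch. It considers only a single month's committed volume, as the examples do, and the function and constant names are hypothetical.

```python
GB_PER_SOURCE_PER_MONTH = 10  # ingestion allocation per data source, per the text


def month_end_balance(committed_sources: int, actual_sources: int, ingested_gb: float) -> float:
    """Volume (GB) left over from one month and available to a future month."""
    committed_gb = committed_sources * GB_PER_SOURCE_PER_MONTH
    # Sources above the commitment are charged and bring their own allocation.
    additional_sources = max(actual_sources - committed_sources, 0)
    additional_gb = additional_sources * GB_PER_SOURCE_PER_MONTH
    return committed_gb + additional_gb - ingested_gb


print(month_end_balance(100, 120, 900))   # 300
print(month_end_balance(100, 120, 1100))  # 100
```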
The Data Pool provides flexibility in the billing model that allows partners to have months where they ingest more data than their monthly allocation without receiving an additional unexpected bill for the difference. As long as there are other months with less data ingested that can offset the months with overages, then the partner can expect a consistent spend.
All data allocations will expire after 12 months. This means that any remaining data volume from the first month of committed data will expire in the 13th month.
Per-Data Source Billing
By billing based on the number of Data Sources, our partners will be able to plan their spend roughly around the number of devices and users they manage and the third-party services they use (identity providers, third-party SaaS, etc.). Each Data Source will be allocated and billed for 10 GB/mo of ingestion even if the device uploads less than the threshold. How we intend to handle overages is covered later in this document. The ingestion amount will measure the uncompressed data size that is uploaded to cold storage, but will not include the size of any Filtered Data.
Based on the metrics we collected since the launch of Managed SIEM, we have seen that with proper Smart Filtering we can reduce the amount of collected and ingested logs and filter significant amounts of noise to keep the data sources below the allocated limit. Even in cases where a Data Source may generate more than the allocation of logs for the month, this is often offset by other Data Sources that generate less than their allocation of logs for the month.
Smart Filtering
The intention of the Huntress Managed SIEM is that we leverage our expertise and knowledge of cyber security and incident response to determine which parts of the data are security-relevant and should be captured and which parts should be considered mostly noise. This is the intent of the Smart Filtering feature of SIEM ingestion. There are a few cases where either we are unable to effectively perform Smart Filtering or the partner has explicitly requested that filtering be disabled.
We can realistically only provide Smart Filtering when we are able to determine what type of data is being ingested and what event it is describing. This is the ideal state, but may not be technically possible for a number of reasons either temporarily or permanently. In these cases we won’t be able to apply Smart Filtering and all ingested data will count towards the source ingestion threshold.
Predictable Billing
Predictable Billing will attempt to smooth potential overages due to months with increased ingestion paired with months with less ingestion. The way this works is that when a partner commits to some amount of spend with Managed SIEM, they are committing to some number of Data Sources and each source has a monthly allocation. These Data Source allocations will be added to the Account Data Pool at the beginning of the subscription period. By allocating the entire Data Pool at the beginning of the subscription, we ensure flexibility for months that may have more ingestion than the monthly allocation.
The biggest issue we have seen with other vendors’ approaches to fixed-spend billing with SIEM ingestion is that when the limit is reached, ingestion stops and all data for the remainder of the month is dropped. This is not ideal for the partner. With our predictable billing we will continue to collect data when the Data Pool limit is reached. The tradeoff is that we will start to truncate the oldest data that we have stored for the partner. This way we can maintain a predictable cost for the partner, at the expense of no longer having a full 12 months of data stored.
For example, if a partner commits to 10 data sources at 10 GB/mo of storage allocation, their monthly Data Allocation will be 100 GB and their Data Pool will contain 1,200 GB. If instead of the 100 GB/mo of ingestion, they end up generating 200 GB/mo, they will exhaust their Data Pool only 6 months into their subscription period. If this ingestion rate continues we will truncate the oldest data as new data is ingested, leading to a continuous 6-month data retention period, but the partner won’t incur any additional charges beyond what they committed to.
For a partner who has to set a yearly budget and cannot easily increase it, this gives them a continuous look-back period, which is not available from many other SIEM vendors. Some vendors will stop collecting new data when the monthly allocation is reached. In the example above, assuming consistent ingestion, this would mean that data would be retained for 12 months, but only the first half of each month would have data. Days at the end of the month would have no data retained, leading to big gaps in visibility. We believe that a continuous retention period is absolutely necessary, and for cost-conscious partners we feel that a shorter retention length is a much better tradeoff for maintaining consistent cost.
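The retention figure in this example can be estimated with a rough sketch. It assumes a constant monthly ingestion rate and that stored data is truncated so it never exceeds the size of the annual Data Pool; the function names are hypothetical.

```python
def annual_data_pool_gb(committed_sources: int, gb_per_source: float = 10, months: int = 12) -> float:
    """Annual committed volume: sources x GB per source per month x months."""
    return committed_sources * gb_per_source * months


def steady_state_retention_months(
    committed_sources: int,
    monthly_ingest_gb: float,
    max_months: int = 12,
) -> float:
    """Approximate look-back period once truncation of the oldest data begins.

    Assumes constant ingestion; retention is capped at the standard 12 months.
    """
    if monthly_ingest_gb <= 0:
        return max_months
    pool_gb = annual_data_pool_gb(committed_sources)
    return min(max_months, pool_gb / monthly_ingest_gb)


print(annual_data_pool_gb(10))                 # 1200
print(steady_state_retention_months(10, 200))  # 6.0
```

A partner ingesting at or below their monthly allocation keeps the full 12 months under this model, since the result is capped at `max_months`.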
Customers should be aware that when they add data sources beyond their minimum commitment, the monthly bill will reflect the total log source count: each additional log source is charged at the same per-log-source rate as the committed minimum.
Billable log sources include any log source that has sent data in the last 30 days. After 30 days of inactivity a log source will be listed as inactive and no longer count as a billable source.
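The 30-day activity rule can be expressed as a simple check. This is a sketch only: the names are hypothetical, and treating a source that last sent data exactly 30 days ago as still billable is an assumption, since the text does not specify the boundary.

```python
from datetime import datetime, timedelta

INACTIVITY_WINDOW = timedelta(days=30)


def is_billable(last_data_received: datetime, now: datetime) -> bool:
    """True if the log source has sent data within the last 30 days.

    Assumption: exactly 30 days since the last data still counts as billable.
    """
    return now - last_data_received <= INACTIVITY_WINDOW
```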
Purchasing Additional Ingestion Capacity
For those cases where partners are ingesting more data than they initially expected and Smart Filtering isn’t able to reduce the data volume further, we will provide a means for a partner to purchase additional ingestion capacity. This can be handled in a couple of different ways. The partner could opt to increase their commitment at the current price tier and would be charged the increased rate for the remainder of their subscription period. Alternatively, the partner could increase their commitment to the next higher tier and replace their existing 12-month subscription with a new 12-month subscription at the new tier with a cheaper per-GB price.
Log Retention and Access
Partner data will be held in active storage for 1 month and in long-term cold storage for 12 months. During the first month the data is in both active and cold storage simultaneously, so the total duration of storage does not exceed 12 months. For the duration of the first month, all non-filtered data will be searchable via the SIEM console. For the remaining 11 months, cold-stored data may be “rehydrated” to active storage for search and compliance use cases. To protect ourselves from the additional costs associated with excessive rehydrations, we reserve the right to charge an additional fee of $1/GB of rehydrated data beyond an included 500 GB of rehydration per year. We have no intention of charging our partners to rehydrate data, but rehydration does incur a cost, and we have built our pricing in a way that does not include excessive overhead for these costs.
We also offer the ability to store logs for 90 days active and 7 years cold. This is an add-on SKU that is in addition to the standard Log Source billing. It is offered as a flat rate per log source and applied by organization. This 'extended retention' offering is provided to allow customers to meet per-organization compliance requirements that call for more than 1 year of log retention. Purchasing extended retention does not change the allocated ingestion pool of 10 GB per log source per month. It extends the searchable active storage of logs from 1 month to 3 months, and the cold storage from 1 year to 7 years. Please reach out to your Account Manager for more information.
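To make the rehydration threshold concrete, the sketch below computes the most that could be charged under the stated terms. It reflects only the reserved right described above, not a fee that is routinely applied, and the names are hypothetical.

```python
INCLUDED_REHYDRATION_GB_PER_YEAR = 500
FEE_PER_GB_USD = 1.0


def potential_rehydration_fee(rehydrated_gb_this_year: float) -> float:
    """Upper bound on the rehydration fee (USD) for one year.

    Only volume beyond the included 500 GB could be charged, at $1/GB.
    """
    billable_gb = max(rehydrated_gb_this_year - INCLUDED_REHYDRATION_GB_PER_YEAR, 0)
    return billable_gb * FEE_PER_GB_USD
```

Under these terms, rehydrating 650 GB in a year could incur at most a $150 fee, while anything up to 500 GB incurs none.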