Data Collections

Data Collections are the foundation of Octigen's data integration system. They organize your data files and define how those files are identified and accessed.

What is a Data Collection?

A Data Collection is:

  • A container for related data files
  • A schema defining how files are named and organized
  • A connection point between your data and slide templates

Think of it as a smart folder that knows what kind of files belong inside and how to find the right one when asked.

Creating a Data Collection

Navigate to Data Collections

  1. Go to your space's main page
  2. Click the "Data" tab
  3. Click "Add Data Collection"

Basic Information

Collection Name

  • Human-readable name for the collection
  • Example: "Monthly Portfolio Holdings"

Source Format

  • CSV - Delimited text files. Comma-separated, semicolon-separated, and tab-separated files are supported. Files must be UTF-8 encoded.
  • Excel (.xlsx) - Microsoft Excel files with a single sheet containing raw data (calculated columns are not supported at this point).
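
Before uploading a CSV file, it can help to confirm locally that it is UTF-8 encoded and to see which delimiter it uses. The following is a minimal sketch using only the Python standard library. It is not part of Octigen, and the function name inspect_csv is hypothetical.

```python
import csv
from pathlib import Path

# Delimiters named in the format description: comma, semicolon, tab.
CANDIDATE_DELIMITERS = ",;\t"


def inspect_csv(path: Path) -> str:
    """Return the detected delimiter; raise if the file is not valid UTF-8."""
    raw = path.read_bytes()
    # decode() raises UnicodeDecodeError on bytes that are not valid UTF-8.
    # "utf-8-sig" additionally strips a leading byte-order mark, if present.
    text = raw.decode("utf-8-sig")
    # csv.Sniffer guesses the delimiter from a sample of the file and
    # raises csv.Error if it cannot decide.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=CANDIDATE_DELIMITERS)
    return dialect.delimiter
```

A UnicodeDecodeError from this check means the file should be re-saved as UTF-8 before upload.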

Source Connection

Currently only one type of source connection is supported: Manual upload.

Data Type

Currently only one data type is supported: Simple 2D table with header row.

Defining Parameters

Parameters identify individual files within the collection. For each parameter, you define:

Name

  • The parameter name (lowercase, use underscores)
  • Examples: valuation_date, portfolio_id, version

Type

  • date - Date values (represented in YYYY-MM-DD format within Octigen)
  • string - Text values (portfolio IDs, names)
  • integer - Whole-number values (versions, sequence numbers)

Mapping

  • prefix - Parameter value appears at the start of the filename
  • suffix - Parameter value appears at the end of the filename
  • Coming soon: mapping by sheet name, mapping by column value.

Note: Because only the "prefix" and "suffix" mappings are currently available, the number of useful parameters is effectively limited to two. Later versions of Octigen will introduce more mapping types.
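
To make the three fields concrete, here is a small sketch of how a parameter definition could be modeled and checked. It is an illustration only, not Octigen's internal representation. The class Parameter, the function check_schema, the exact naming rule, and the "one parameter per mapping" check are all assumptions derived from the descriptions above.

```python
import re
from dataclasses import dataclass

PARAM_TYPES = {"date", "string", "integer"}
MAPPINGS = {"prefix", "suffix"}
# Assumed naming rule: lowercase letters, digits, underscores; starts with a letter.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


@dataclass(frozen=True)
class Parameter:
    name: str
    type: str
    mapping: str

    def __post_init__(self) -> None:
        if not NAME_PATTERN.fullmatch(self.name):
            raise ValueError(f"invalid parameter name: {self.name!r}")
        if self.type not in PARAM_TYPES:
            raise ValueError(f"unknown type: {self.type!r}")
        if self.mapping not in MAPPINGS:
            raise ValueError(f"unknown mapping: {self.mapping!r}")


def check_schema(params: list[Parameter]) -> None:
    """Reject schemas that assign the same mapping to two parameters (assumed rule)."""
    used = [p.mapping for p in params]
    if len(used) != len(set(used)):
        raise ValueError("each mapping (prefix, suffix) can be used only once")
```

For example, check_schema([Parameter("portfolio_id", "string", "prefix"), Parameter("valuation_date", "date", "suffix")]) passes, while two suffix parameters raise a ValueError.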

Example: Monthly Sales Data

Settings:

  • Collection Name: "Monthly Sales Data"
  • Source Format: CSV
  • Parameters:
    • valuation_date (type: date, mapping: suffix, required: yes)

Expected filenames:

sales_2024-12-01.csv
sales_2024-11-01.csv
sales_2024-10-01.csv

Example: Multi-Portfolio Holdings

Settings:

  • Collection Name: "Portfolio Holdings"
  • Object Type: portfolio-holdings
  • Source Format: Excel (.xlsx)
  • Parameters:
    • portfolio_id (type: string, mapping: prefix, required: yes)
    • valuation_date (type: date, mapping: suffix, required: yes)

Expected filenames:

GROWTH_FUND_2024-12-01.xlsx
INCOME_FUND_2024-12-01.xlsx
GROWTH_FUND_2024-11-01.xlsx

Uploading Data Files

Drag and Drop Upload

  1. Open the data collection
  2. Drag files into the upload area, or click to browse
  3. Files are validated against the collection's parameter schema

Upload Validation

When you upload a file, Octigen validates:

  • Filename matches the expected pattern
  • Parameter values can be extracted
  • Parameter types are correct (valid dates, integers)

Validation examples:

Collection Parameters | Filename | Result
valuation_date (suffix, date) | sales_2024-12-01.csv | ✅ Valid
valuation_date (suffix, date) | sales_20241201.csv | ✅ Valid (alternative format)
valuation_date (suffix, date) | sales_december.csv | ❌ Invalid date
portfolio_id (prefix, string), valuation_date (suffix, date) | FUND_A_2024-12-01.csv | ✅ Valid
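
The type check in the last validation step can be sketched as follows. This is an illustrative approximation, not Octigen's validator. The function parse_value is hypothetical, it operates on a value that has already been extracted from the filename, and it only tries the two date formats shown in the table above.

```python
from datetime import date, datetime
from typing import Union


def parse_value(raw: str, type_name: str) -> Union[date, int, str]:
    """Convert an extracted filename value to its declared type or raise ValueError."""
    if type_name == "date":
        # Try the ISO format first, then the compact format.
        for fmt in ("%Y-%m-%d", "%Y%m%d"):
            try:
                return datetime.strptime(raw, fmt).date()
            except ValueError:
                continue
        raise ValueError(f"invalid date: {raw!r}")
    if type_name == "integer":
        # Accept plain ASCII digits only.
        if not (raw.isascii() and raw.isdigit()):
            raise ValueError(f"invalid integer: {raw!r}")
        return int(raw)
    if type_name == "string":
        if not raw:
            raise ValueError("empty string value")
        return raw
    raise ValueError(f"unknown type: {type_name!r}")
```

With this sketch, "2024-12-01" and "20241201" both parse to the same date, while "december" raises a ValueError, mirroring the first three rows of the table.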

Duplicate Detection

If you upload a file with the same parameter values as an existing file:

  • The system detects the duplicate
  • You can choose to overwrite the existing file
  • Or you can cancel the upload and keep the original

This prevents accidental data loss while allowing intentional updates.

Viewing Collection Contents

File List

The collection detail page shows all uploaded files with:

  • Filename
  • Parameter values extracted from the filename
  • File size
  • Upload date / last modified
  • Actions (download, delete)

Managing Data Files

Downloading

Click the download icon next to any file to download the original.

Deleting

Click the delete icon to remove a file. This action:

  • Cannot be undone
  • Affects future presentations only
  • Does not affect already-committed presentations

Updating

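To update data for the same parameters:

  1. Upload the new file with the same parameter values in its filename
  2. When the duplicate is detected, confirm that you want to overwrite

Alternatively, delete the previous file first and then upload the new one.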

Parameter Mapping Details

Prefix Mapping

The parameter value appears at the start of the filename, followed by an underscore.

{parameter_value}_{rest_of_filename}.{extension}

Example: portfolio_id with prefix mapping

  • Parameter value: GROWTH_FUND
  • Filename: GROWTH_FUND_2024-12-01.csv

Suffix Mapping

The parameter value appears at the end of the filename (before extension), preceded by an underscore.

{base_name}_{parameter_value}.{extension}

Example: valuation_date with suffix mapping

  • Parameter value: 2024-12-01
  • Filename: sales_data_2024-12-01.csv

Combining Mappings

Multiple parameters combine in the filename:

{prefix_param}_{base}_{suffix_param}.{extension}

Example: portfolio_id (prefix) + valuation_date (suffix)

  • Parameters: portfolio_id=INCOME, valuation_date=2024-12-01
  • Filename: INCOME_holdings_2024-12-01.xlsx
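
When files are generated by a script, the patterns above can be applied mechanically. The helper below is a minimal sketch under the assumption that the base name is optional, as in the earlier example filename GROWTH_FUND_2024-12-01.xlsx. The function build_filename is hypothetical and not part of Octigen.

```python
from typing import Optional


def build_filename(
    extension: str,
    base: str = "",
    prefix: Optional[str] = None,
    suffix: Optional[str] = None,
) -> str:
    """Join prefix value, base name, and suffix value with underscores."""
    # Skip parts that are missing or empty so no double underscores appear.
    parts = [part for part in (prefix, base, suffix) if part]
    if not parts:
        raise ValueError("at least one of prefix, base, or suffix is required")
    return "_".join(parts) + "." + extension


print(build_filename("xlsx", base="holdings", prefix="INCOME", suffix="2024-12-01"))
print(build_filename("csv", base="sales_data", suffix="2024-12-01"))
```

The two calls reproduce the filenames INCOME_holdings_2024-12-01.xlsx and sales_data_2024-12-01.csv from the examples above.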

Date Format Flexibility

For date parameters, Octigen recognizes multiple formats in filenames:

Format | Example
ISO | 2024-12-01
Compact | 20241201
Dot separator | 2024.12.01
Underscore separator | 2024_12_01

All formats are normalized internally, so:

  • Upload: sales_20241201.csv
  • Query: valuation_date=2024-12-01
  • Result: ✅ Match found
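
The normalization step can be illustrated with a short sketch that reads a trailing date in any of the four listed formats and converts it to a single canonical value. This is an approximation for illustration, not Octigen's parser. The function trailing_date and the regular expression are assumptions.

```python
import re
from datetime import date
from pathlib import PurePath
from typing import Optional

# Year, optional separator (-, ., or _), month, the same separator again, day,
# anchored at the end of the filename stem.
TRAILING_DATE = re.compile(r"(\d{4})([-._]?)(\d{2})\2(\d{2})$")


def trailing_date(filename: str) -> Optional[date]:
    """Return the date at the end of the filename stem, or None if there is none."""
    stem = PurePath(filename).stem  # drops only the final extension
    match = TRAILING_DATE.search(stem)
    if match is None:
        return None
    year, _separator, month, day = match.groups()
    try:
        return date(int(year), int(month), int(day))
    except ValueError:
        return None  # digits matched, but not a real calendar date


uploaded = trailing_date("sales_20241201.csv")
query = date.fromisoformat("2024-12-01")
print(uploaded == query)  # True
```

Because both sides are compared as date objects, the separator used in the filename no longer matters once the value has been parsed.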

Best Practices

Naming Conventions

Collection Names:

  • Be descriptive but concise
  • Avoid special characters

Parameter Names:

  • Use lowercase with underscores: valuation_date, portfolio_id
  • Match industry conventions where applicable
  • Be consistent across collections

Filenames:

  • Follow the parameter mapping pattern exactly
  • Use consistent date formats
  • Avoid spaces and special characters

Organization

One kind of data per collection:

  • ✅ "Sales Data" collection with sales files
  • ❌ "All Data" collection mixing sales, inventory, and HR files

Logical groupings:

  • By report type (monthly sales, quarterly reviews)
  • By data domain (portfolio, market, client)
  • By update frequency (daily, weekly, monthly)

Data Quality

Before uploading:

  • Verify column headers are correct
  • Check data completeness and quality

Consistent structure:

  • Same columns across files in a collection
  • Consistent data types per column
  • No extra rows or formatting
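
A quick local check for the "same columns across files" rule might look like the sketch below. It compares the header row of each CSV file against the first file's header. The function names are hypothetical, and the files are passed as text to keep the example self-contained.

```python
import csv
import io


def header_of(csv_text: str, delimiter: str = ",") -> list[str]:
    """Return the first row of the CSV text, or an empty list for empty input."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return next(reader, [])


def mismatched_headers(files: dict[str, str], delimiter: str = ",") -> dict[str, list[str]]:
    """Return the files whose header differs from the first file's header."""
    headers = {name: header_of(text, delimiter) for name, text in files.items()}
    if not headers:
        return {}
    reference = next(iter(headers.values()))
    return {name: header for name, header in headers.items() if header != reference}
```

An empty result means every file shares the reference header. Any entry in the result names a file to inspect before upload.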

Common Issues

"Filename doesn't match parameter pattern"

Cause: The filename doesn't match the expected pattern for the collection's parameters.

Solution:

  • Check parameter mappings (prefix vs suffix)
  • Verify date format if using date parameters
  • Ensure all required parameters are in the filename

"Invalid parameter value"

Cause: A parameter value couldn't be parsed correctly.

Solution:

  • For dates: use a recognized format such as YYYY-MM-DD or YYYYMMDD
  • For integers: use only digits
  • For strings: avoid special characters

"Duplicate file detected"

Cause: A file with the same parameters already exists.

Solution:

  • Choose to overwrite the existing file when prompted, or first delete the previous version of the file with the same parameters.

Next Steps