
Petabyte Archiving

Documenos ®

Petabyte-Scale Archiving Infrastructure
Documenos Archive Management Technical Architecture

1. Scalable Infrastructure


  • Built on ASP.NET Core
  • Compatible with PostgreSQL v15 and later
  • Architecture supports both horizontal and vertical scaling
  • Cluster support
  • Seamless access behind a load balancer
  • Every server connects to the same database.
  • Horizontal capacity grows by adding servers.
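
The scaling model above (interchangeable servers behind a load balancer, all connected to one database) can be illustrated with a minimal Python sketch. It is not Documenos code: `SharedDatabase`, `AppServer`, `RoundRobinBalancer` and the round-robin policy are hypothetical names and assumptions chosen for illustration, and an in-memory dictionary stands in for PostgreSQL.

```python
from itertools import cycle


class SharedDatabase:
    """Stand-in for the single database that every server connects to."""

    def __init__(self):
        self.records = {}


class AppServer:
    """A stateless node: all persistent state lives in the shared database."""

    def __init__(self, name, db):
        self.name = name
        self.db = db

    def store(self, key, value):
        self.db.records[key] = value

    def fetch(self, key):
        return self.db.records.get(key)


class RoundRobinBalancer:
    """Hands each request to the next server in turn (assumed policy)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = cycle(self._servers)

    def pick(self):
        return next(self._next)


db = SharedDatabase()
# Adding capacity means adding entries to this list of servers.
balancer = RoundRobinBalancer(AppServer(f"node-{i}", db) for i in range(1, 4))

writer = balancer.pick()            # first request lands on node-1
writer.store("doc-42", "archived")
reader = balancer.pick()            # next request lands on node-2
print(reader.name, reader.fetch("doc-42"))  # node-2 archived
```

Because no server keeps state of its own, a record written through one node is immediately readable through any other, which is what makes adding servers a safe way to add capacity in this sketch.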

2. Archive Tiering and Data Flow
The following run as background services:


  • Automatic upload from the file system to the archive
  • Automatic upload from FTP to the archive
  • Transfer between archive tiers
  • Batch extraction operations
  • Physical purging of deleted data
  • Each service can be enabled or disabled per server.
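
Per-server control of background services could be modelled as a simple mapping from server to enabled services. The sketch below is illustrative only: the server names, the service identifiers and the functions `validate`, `services_for` and `servers_running` are all hypothetical and do not reflect the product's actual configuration format.

```python
# Hypothetical identifiers for the background services listed above.
KNOWN_SERVICES = {
    "fs_upload",       # file system -> archive
    "ftp_upload",      # FTP -> archive
    "tier_transfer",   # transfer between archive tiers
    "batch_extract",   # batch extraction
    "purge_deleted",   # physical purging of deleted data
}

# Assumed layout: each server lists the services it is allowed to run.
SERVER_SERVICES = {
    "archive-01": {"fs_upload", "ftp_upload"},
    "archive-02": {"tier_transfer", "batch_extract"},
    "archive-03": {"purge_deleted"},
}


def validate(config):
    """Raise ValueError if any server references an unknown service."""
    for server, services in config.items():
        unknown = services - KNOWN_SERVICES
        if unknown:
            raise ValueError(f"{server}: unknown services {sorted(unknown)}")


def services_for(server, config):
    """Return the services enabled on one server, sorted by name."""
    return sorted(config.get(server, set()))


def servers_running(service, config):
    """Return the servers on which a given service is enabled."""
    return sorted(s for s, enabled in config.items() if service in enabled)


validate(SERVER_SERVICES)
print(services_for("archive-01", SERVER_SERVICES))      # ['fs_upload', 'ftp_upload']
print(servers_running("purge_deleted", SERVER_SERVICES))  # ['archive-03']
```

Keeping the assignment in one place makes it easy to dedicate heavy jobs, such as tier transfers, to specific machines.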

3. Content Processing Services
For archived content:


  • Image processing (thumbnail, resizing, EXIF/IPTC extraction)
  • PDF text indexing
  • OCR tasks
  • Speech-to-text conversion for video files (60 languages)
  • Speech-to-text conversion for audio files (60 languages)
  • Generated text is saved to the search indexes.
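
As a rough illustration of the flow above, the sketch below routes a file to processing tasks by extension and stores generated text in a small inverted index. Everything in it is an assumption made for the example: the extension-to-task mapping, the task names, and the `SearchIndex` class. The extraction steps themselves (OCR, speech recognition) are not implemented; the text passed to `add` stands in for their output.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed mapping; the source does not say which tasks apply to which types.
TASKS_BY_EXTENSION = {
    ".jpg": ["thumbnail", "exif_iptc", "ocr"],
    ".pdf": ["pdf_text"],
    ".mp4": ["speech_to_text"],
    ".mp3": ["speech_to_text"],
}


def tasks_for(filename):
    """Return the processing tasks for a file, based on its extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    return TASKS_BY_EXTENSION.get(suffix, [])


class SearchIndex:
    """Tiny inverted index: token -> set of document ids."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        # Whitespace tokenization only; a real index would do much more.
        for token in text.lower().split():
            self._postings[token].add(doc_id)

    def search(self, term):
        return sorted(self._postings.get(term.lower(), set()))


index = SearchIndex()
index.add("scan-001.jpg", "Invoice total 1200")        # pretend OCR output
index.add("meeting.mp4", "Quarterly invoice review")   # pretend transcript

print(tasks_for("Report.PDF"))    # ['pdf_text']
print(index.search("invoice"))    # ['meeting.mp4', 'scan-001.jpg']
```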

4. Data Model and Metadata Capacity
Dynamic datasets support up to 1.4 billion tables.
Limits for each table in a dynamic dataset:


  • 250 fields (recommended maximum)
  • 1,600 fields (technical upper limit)
  • ~4.2 billion rows of storage capacity
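
The two field limits suggest a soft threshold and a hard one. A minimal sketch of how a client might check a planned table against them follows; the function `check_field_count` and its return values are hypothetical, and only the numbers 250 and 1600 come from the list above.

```python
RECOMMENDED_FIELDS = 250   # recommended usage, from the list above
MAX_FIELDS = 1600          # technical upper limit, from the list above


def check_field_count(field_count):
    """Classify a planned field count as 'ok', 'warning' or 'rejected'."""
    if field_count < 0:
        raise ValueError("field_count must be non-negative")
    if field_count > MAX_FIELDS:
        return "rejected"   # exceeds the technical upper limit
    if field_count > RECOMMENDED_FIELDS:
        return "warning"    # allowed, but above the recommended usage
    return "ok"


for n in (120, 250, 251, 1600, 1601):
    print(n, check_field_count(n))
```

Treating the recommended figure as a warning rather than an error is a design choice of this sketch, not a documented behavior.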

Datasets:


  • Authorization-based visibility
  • Excel upload / download
  • Automatic data updates via event triggers

5. Security and Access Control


  • LDAP integration
  • Built-in password system
  • MFA (OTP, Email, SMS)
  • IP-based access restrictions
  • Group-based authorization
  • Vendor declaration that the software contains no backdoors
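
Two of the controls above, IP-based restrictions and group-based authorization, combine naturally into a single access check. The sketch below uses Python's standard `ipaddress` module. The allowed networks, group names, permissions and the functions `ip_allowed`, `group_can` and `authorize` are hypothetical examples, not the product's actual rules or API.

```python
import ipaddress

# Assumed allowlist of networks permitted to reach the archive.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]

# Assumed mapping from group to the actions its members may perform.
GROUP_PERMISSIONS = {
    "archivists": {"read", "upload"},
    "auditors": {"read"},
}


def ip_allowed(address):
    """True if the client address falls inside any allowed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


def group_can(groups, action):
    """True if at least one of the user's groups grants the action."""
    return any(action in GROUP_PERMISSIONS.get(g, set()) for g in groups)


def authorize(address, groups, action):
    """Both checks must pass: network origin first, then group rights."""
    return ip_allowed(address) and group_can(groups, action)


print(authorize("10.1.2.3", ["auditors"], "read"))       # True
print(authorize("10.1.2.3", ["auditors"], "upload"))     # False
print(authorize("203.0.113.5", ["archivists"], "read"))  # False
```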