Challenges
You may face one or more of these challenges working with MS SQL Server databases on premise or in the Azure cloud:
- Data discovery: profiling, classification, ERDs
- Unloading and loading large tables
- Routine utility operations (reorgs)
- Slow queries, SSIS ETL jobs, or Power BI displays
- Data migration or replication
- Poor data quality or unity
- Masking PII in tables statically, or in applications like MS Dynamics dynamically
- Generating smart, safe test data for prototyping, DevOps, demos, etc.
Specific performance diagnoses and tuning also take time, and may affect other users. Finally, stored SQL procedures may also be programmed inefficiently, require optimization, and then still take too long to run.
Solutions
Simple, affordable IRI software addresses these issues. Services are also available for special needs.
To: |
Use: |
---|---|
Speed Unloads |
IRI FACT (Fast Extract) v2.5 to speed SQL Server unloads. Use SQL syntax in FACT's CLI or GUI to rapidly dump large tables to flat files. Specify SELECT * FROM very large tables so you do not encumber the DB with qualifiers like distinct, order by, OR group by. Once that data are in flat files, use the SortCL program in IRI CoSort to de-dupe, sort, join, group (and report on) the extracts far faster ... in parallel, outside the database. Use the IRI Voracity platform to combine FACT, CoSort, and your load utility or downstream applications in fast, affordable, visual ETL operations in Eclipse. |
Speed Loads |
IRI CoSort to pre-sort flat files for bulk bcp imports. Microsoft recommends that almost every SQL Server table should have a clustered index. Create the clustered index first, and pre-sort the input files (on the primary index key) to shorten the create index step. Use the SORTED_DATA option when creating indexes to reflect that CoSort bypassed the slower sort in bcp. |
Speed Reorgs and Queries |
The Offline Reorg wizard in the IRI Workbench to combine and automate the combination of the above: fast extraction, sorting, and re-loading operations for classic (offline) SQL Server reorgs. This helps maintain your tables in optimal query order. |
Optimize Transforms (ETL) and BI Data Preparation |
IRI CoSort to relieve the database of processing overhead. Leverage file system I/O, multi-threading, and the proven data transformation power and consolidation of the CoSort SortCL program. Use case: www.iri.com/customers/case-studies/cosort-transforms-big-marketing-data-10x-faster-than-sql If you need a faster alternative to SSIS or another ETL tool, see IRI Voracity, a total data management platform powered by CoSort (or Hadoop), and front-ended in Eclipse for access to (and integration and governance of) multiple data sources. Use CoSort or Voracity to wrangle data for Power BI, too. |
Migrate and Replicate SQL Server Databases |
IRI NextForm 'Database' edition to acquire, re-map, re-format, and build/populate new tables during migrations to and from SQL Server. You can also use IRI NextForm or the SortCL program in IRI CoSort or Voracity to re-map and convert data in SQL Sever, produce custom reports, replicas, and federated views of data. |
Cleanse and Enrich Data (Quality) |
CoSort or Voracity to find and fix errors in data values, ranges, format, or context. Scrub and standardize duplicate or fuzzy-matched values. See more data quality possibilities here. |
Classify and Mask Sensitive Data |
IRI FieldShield (or IRI DarkShield) or Voracity (has both) to find, classify, and differentially mask sensitive data in SQL Server, like personally identifiable information (PII) or protected health information (PHI). FieldShield applies masking, encryption, and other de-identifying functions to one or more columns and tables at once. DarkShield can also find and mask PII in FILESTREAM columns. And, both masking tools optionally integrate with Azure Key Vault for advanced encryption key management. Ccomply with privacy laws like HIPAA, PCI DSS, FERPA, and the GDPR! |
Generate SQL Server Test Data |
IRI RowGen (or Voracity) to rapidly populate SQL Server or SSIS ETL operations with safe test data. RowGen uses your data models to automatically generate the test data for an entire database with referential integrity. A subsetting wizard (with masking options) is another avaialable approach, which can also incorporate FieldShield functions at the same time to keep the subset data masked. All the above IRI operations are command-line driven, and can be executed with DB cloning tools like Windocks to provision virtualized test databases in a cloud or on-premise repository or container. FieldShield and RowGen also operate from within Azure DevOps CI/CD pipelines. |