When working with complex data pipelines, there are scenarios where you need to dynamically insert data from multiple source tables into target tables based on specific mappings and conditions. By leveraging metadata tables and stored procedures, you can create a robust and reusable framework to handle such requirements. In this blog, we’ll discuss how to achieve this dynamic insertion using metadata-driven logic, employ cursors (with a word of caution), and ensure proper allocation and deallocation of resources.
What is Metadata-Driven Data Insertion?
Metadata-driven data insertion uses a metadata table to map source tables, target tables, column mappings, and filtering conditions dynamically. The benefit is flexibility: adding new mappings or conditions simply requires updating the metadata table without changing the core logic.
Steps to Implement Dynamic Insertion
- Create the Metadata Table: This table stores the mappings between source and target tables, column names, and filter conditions.
- Create Target Tables: Define the tables where the data will be inserted.
- Write a Stored Procedure: This procedure reads the metadata table, builds dynamic SQL based on the mappings, and inserts data into the target tables.
- Use Cursors (With Caution): A cursor is used to iterate through the metadata table. While cursors are resource-intensive and should be avoided when possible, they can be appropriate for smaller datasets or certain dynamic operations.
- Properly Allocate and Deallocate Cursors: To avoid memory leaks, ensure that cursors are properly opened, fetched, closed, and deallocated.
1. Creating the Metadata Table
The metadata table defines how data should flow from source tables to target tables.
CREATE TABLE MetadataTable (
ID INT IDENTITY(1,1) PRIMARY KEY,
SourceTable NVARCHAR(100),
TargetTable NVARCHAR(100),
SourceColumn NVARCHAR(100),
TargetColumn NVARCHAR(100),
WhereCondition NVARCHAR(MAX) -- Dynamic WHERE clause
);
Example Data in MetadataTable
| ID | SourceTable  | TargetTable | SourceColumn | TargetColumn  | WhereCondition |
|----|--------------|-------------|--------------|---------------|----------------|
| 1  | SalesData    | StagingData | SalesID      | TransactionID | Amount > 1000  |
| 2  | SalesData    | StagingData | SalesAmount  | TotalAmount   | Amount > 1000  |
| 3  | CustomerData | MasterData  | CustomerID   | ClientID      | Country = 'US' |
2. Creating Target Tables
Let’s define the target tables to receive the data.
Example Target Table: StagingData
CREATE TABLE StagingData (
TransactionID INT,
TotalAmount FLOAT
);
Example Target Table: MasterData
CREATE TABLE MasterData (
ClientID INT
);
3. Writing the Stored Procedure
The stored procedure dynamically generates and executes INSERT INTO statements based on the metadata.
CREATE PROCEDURE DynamicInsertProcedure
AS
BEGIN
-- Declare variables for cursor iteration
DECLARE @SourceTable NVARCHAR(100);
DECLARE @TargetTable NVARCHAR(100);
DECLARE @SourceColumn NVARCHAR(100);
DECLARE @TargetColumn NVARCHAR(100);
DECLARE @WhereCondition NVARCHAR(MAX);
DECLARE @SQL NVARCHAR(MAX);
-- Declare the cursor
DECLARE MetadataCursor CURSOR FOR
SELECT SourceTable, TargetTable, SourceColumn, TargetColumn, WhereCondition
FROM MetadataTable;
-- Open the cursor
OPEN MetadataCursor;
-- Fetch the first row
FETCH NEXT FROM MetadataCursor INTO @SourceTable, @TargetTable, @SourceColumn, @TargetColumn, @WhereCondition;
-- Loop through the cursor
WHILE @@FETCH_STATUS = 0
BEGIN
-- Construct dynamic SQL
SET @SQL = 'INSERT INTO ' + @TargetTable + '(' + @TargetColumn + ') ' +
'SELECT ' + @SourceColumn + ' FROM ' + @SourceTable +
' WHERE ' + ISNULL(@WhereCondition, '1=1'); -- Default WHERE to always true if NULL
-- Print the SQL for debugging (optional)
PRINT @SQL;
-- Execute the dynamic SQL
EXEC sp_executesql @SQL;
-- Fetch the next row
FETCH NEXT FROM MetadataCursor INTO @SourceTable, @TargetTable, @SourceColumn, @TargetColumn, @WhereCondition;
END;
-- Close and deallocate the cursor
CLOSE MetadataCursor;
DEALLOCATE MetadataCursor;
END;
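The procedure above assumes every dynamic statement succeeds; depending on the error, a failed INSERT can stop execution before the CLOSE and DEALLOCATE statements run. A minimal variation, sketched below, wraps the execution step in TRY/CATCH so a failed mapping is reported and the loop moves on to the next row, which also guarantees the cursor cleanup is reached. Only the loop body changes; everything else in the procedure stays the same.

-- A sketch of guarding each dynamic statement, assuming failed mappings should be
-- reported and skipped rather than stopping the whole run. Only the loop body changes.
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @SQL = 'INSERT INTO ' + @TargetTable + '(' + @TargetColumn + ') ' +
               'SELECT ' + @SourceColumn + ' FROM ' + @SourceTable +
               ' WHERE ' + ISNULL(@WhereCondition, '1=1');

    BEGIN TRY
        EXEC sp_executesql @SQL;
    END TRY
    BEGIN CATCH
        -- Report the failing statement and continue with the remaining mappings
        PRINT 'Failed: ' + @SQL + ' | ' + ERROR_MESSAGE();
    END CATCH;

    FETCH NEXT FROM MetadataCursor INTO @SourceTable, @TargetTable, @SourceColumn, @TargetColumn, @WhereCondition;
END;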
4. Running the Workflow
4.1 Insert Metadata
Populate the MetadataTable with your mappings.
INSERT INTO MetadataTable (SourceTable, TargetTable, SourceColumn, TargetColumn, WhereCondition)
VALUES
('SalesData', 'StagingData', 'SalesID', 'TransactionID', 'Amount > 1000'),
('SalesData', 'StagingData', 'SalesAmount', 'TotalAmount', 'Amount > 1000'),
('CustomerData', 'MasterData', 'CustomerID', 'ClientID', 'Country = ''US''');
4.2 Execute the Procedure
Run the stored procedure to perform dynamic data insertion.
EXEC DynamicInsertProcedure;
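The walkthrough assumes the source tables referenced in the metadata already exist. For a quick end-to-end test, a hypothetical setup might look like the following; the SalesData and CustomerData schemas here are assumptions inferred from the column names and WHERE conditions in the metadata, not part of the original design.

-- Hypothetical source tables and sample rows for testing only; the exact schemas
-- are assumptions inferred from the metadata mappings and WHERE conditions.
CREATE TABLE SalesData (
    SalesID INT,
    SalesAmount FLOAT,
    Amount FLOAT
);
CREATE TABLE CustomerData (
    CustomerID INT,
    Country NVARCHAR(50)
);

INSERT INTO SalesData (SalesID, SalesAmount, Amount)
VALUES (1, 1500, 1500), (2, 500, 500);

INSERT INTO CustomerData (CustomerID, Country)
VALUES (10, 'US'), (11, 'DE');

EXEC DynamicInsertProcedure;

-- Each metadata row produces its own INSERT, so StagingData should contain two rows:
-- one with TransactionID = 1 and one with TotalAmount = 1500. MasterData should contain ClientID = 10.
SELECT * FROM StagingData;
SELECT * FROM MasterData;

Note that each metadata row generates a separate single-column INSERT. If a target needs several columns populated in one row, the metadata would have to be extended, for example with comma-separated column lists or a grouping key per mapping.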
5. Best Practices for Using Cursors
While cursors provide a straightforward way to process row-by-row logic, they should generally be avoided for large datasets due to their impact on performance. Here are some best practices:
- Use Cursors Sparingly: Whenever possible, replace cursors with set-based operations using JOIN or MERGE statements (see the sketch after this list).
- Limit Cursor Scope: Use cursors only when dynamic, row-based logic is unavoidable.
- Properly Deallocate Resources: Always close and deallocate cursors to free up memory and avoid resource leaks.
- Optimize Metadata: Ensure that the metadata table is indexed and contains only necessary rows to minimize the cursor workload.
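As noted in the first bullet, the row-by-row loop over MetadataTable can itself be replaced with a set-based pattern. One possible sketch, assuming SQL Server 2017 or later for STRING_AGG, builds all the INSERT statements in a single query and executes them as one batch:

-- A set-based alternative to the cursor, assuming SQL Server 2017+ (STRING_AGG).
-- All INSERT statements are generated in one pass over MetadataTable and executed
-- as a single batch; CAST to NVARCHAR(MAX) avoids truncation of long batches.
DECLARE @Batch NVARCHAR(MAX);

SELECT @Batch = STRING_AGG(
    CAST('INSERT INTO ' + TargetTable + '(' + TargetColumn + ') ' +
         'SELECT ' + SourceColumn + ' FROM ' + SourceTable +
         ' WHERE ' + ISNULL(WhereCondition, '1=1') + ';' AS NVARCHAR(MAX)),
    ' ')
FROM MetadataTable;

PRINT @Batch; -- optional: inspect the generated statements
EXEC sp_executesql @Batch;

The data movement itself is still dynamic SQL; the set-based part is that the metadata is read in a single statement instead of being fetched one row at a time.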
6. Benefits of This Approach
- Dynamic Logic: You can add, modify, or remove mappings without changing the stored procedure logic.
- Reusability: The same procedure works across multiple source-target pairs, reducing redundancy.
- Scalability: Handles diverse data flows with minimal changes to the core workflow.
- Debugging: Dynamic SQL can be easily debugged by printing the generated queries.
7. Challenges and Limitations
- Performance: Cursors can slow down execution for large metadata tables. Evaluate whether a set-based approach is feasible for your scenario.
- SQL Injection Risk: The dynamic SQL strings are concatenated from metadata values, so table, column, and filter text must be validated or sanitized to prevent injection attacks (a hardening sketch follows this list).
- Complexity: Debugging dynamic SQL can be challenging if the logic is intricate.
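For the injection point in particular, one common mitigation is to treat the table and column names coming from the metadata as identifiers rather than raw strings. The fragment below is a sketch of how the SET @SQL / EXEC step inside the cursor loop could be hardened with QUOTENAME and an OBJECT_ID existence check; the WhereCondition text is still concatenated verbatim, so it has to come from a trusted, access-controlled metadata table.

-- Hardening sketch for the dynamic SQL step: quote identifiers and verify that
-- the referenced tables exist before executing. WhereCondition remains raw text,
-- so write access to MetadataTable must be restricted.
IF OBJECT_ID(@SourceTable) IS NOT NULL AND OBJECT_ID(@TargetTable) IS NOT NULL
BEGIN
    SET @SQL = 'INSERT INTO ' + QUOTENAME(@TargetTable) + '(' + QUOTENAME(@TargetColumn) + ') ' +
               'SELECT ' + QUOTENAME(@SourceColumn) + ' FROM ' + QUOTENAME(@SourceTable) +
               ' WHERE ' + ISNULL(@WhereCondition, '1=1');
    EXEC sp_executesql @SQL;
END
ELSE
    PRINT 'Skipping mapping: source or target table not found.';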
Conclusion
Dynamic data insertion using metadata tables is a powerful pattern for managing data pipelines in SQL Server. While cursors are helpful for iterating through metadata, they should be used judiciously and always cleaned up after use. By structuring your metadata properly and leveraging stored procedures, you can build scalable and flexible workflows that adapt to changing requirements.
With careful design, this approach can streamline your ETL processes, reduce maintenance overhead, and improve overall data management.
Afroz Labbai
Data Engineering