elt-snap_smlogo.png

eltSnap Desktop

Introduction

eltSnap is a Windows-based application which allows users to orchestrate and control the data integration processes, such as the ELT, which is used for populating a Data Warehouse.  It has been built whilst focusing on the ELT paradigm, representing Extract, Load, and Transform activities which leverages the elasticity available in modern, cloud-based, data warehouses.

 

eltSnap uses the common Project/Package model to define data integration steps and groupings. It comes with a custom runtime engine which supports both on-prem and cloud-based database platforms.

 

eltSnap consist of the following parts:

1. eltSnap Databases, Microsoft SQL Server Databases in which all configurations, as well as logs, are kept.

2. eltSnap Runtime, a Python application which allows the users to execute their projects/packages in which data integration steps are defined. It uses the configurations saved in the eltSnap databases, and can be invoked directly through the console.

3. eltSnap Desktop, a Windows desktop application which allows the users to configure their data integration steps. It connects to the eltSnap databases in order to retrieve and update configurations. It can also connect to the runtime in order to invoke the desired project/package execution.

4. eltSnap Control Panel, a Windows desktop application which allows the users to monitor the execution of their projects and packages. It retrieves the information about the execution from eltSnap databases.

How eltSnap works

eltSnap controls the complete data integration process by allowing the user to define projects, packages, connections, parameters and environments which contain the configurations for the data integration step they wish to execute. These objects do not have to be defined from scratch for every data integration step; they can be reused as many times as the users need (by being added to the desired projects or cloned).

Data integration configuration information is defined by using the following eltSnap objects:

  1. Connections - define the data necessary for connecting to database and email servers, as well as flat files which contain exported data;
     

  2. Parameters -encompass values which are often prone to change so that, when they are changed, the change needs to be made only in one place, from which it will be propagated to all the places in which the parameter has been used;
     

  3. Packages - define data integration steps which include invoking processes, transferring data from one database to another and executing transformations on the extracted data;
     

  4. Environments - define in which conditions the projects should be run, and usually discern the devtest and prod development phases;
     

  5. Projects - containers in which all other objects are located and are used for the execution of multiple packages, as well as for distinguishing between different data integration processed.

All eltSnap objects are simple and can be created in a couple of clicks. In order to define them correctly, intuition, common sense and basic knowledge about the environment for which the data integration needs to be defined and executed is more than enough. It is important to note that no advanced SQL programming skills are required in order to be able to use eltSnap, so even users with no previous experience can start using it quickly and easily. Basic knowledge about CRUD operations is enough to define a package and successfully execute the desired actions, especially due to the fact that most of the configurations are automatically created, so that the user does not have to worry about all the details and get to the execution phase as soon as possible.

 
Projects

 

Creating a new project is one of the easiest eltSnap features. Due to the fact that projects are containers, configuring them does not require entering a lot of information. The user needs to  specify the project name and template group, and their project will be ready for usage. Existing or new connections, parameters, packages and environments can then be added to the new project, after which the project can be executed independently of all other projects and packages.

Template groups define specific behavior of the project execution. Choosing a template group for a project means allowing automatic configuration that will define new connections and parameters depending on the type and purpose of the chosen template group. The template group can be changed before every project execution, making it even easier for users to manage project execution under different circumstances.

The following template groups are supported, with each template group providing extended features of the previous one:

  1. No Framework - no automatic configurations are made for the project execution. Recommended for experienced users who wish to manually configure their logging framework.
     

  2. Framework Logging - adds automatic logging for project execution. This allows the user to get detailed information about the execution (including the number of rows which were modified, informational and error messages), which can be very useful when project execution fails due to error. Due to more detailed monitoring of project execution, this template group also allows restarting the execution from the point of failure instead of from the first package, which can save a lot of time when there is no need to run all project packages again.
     

  3. Framework Logging with Alert Emails - apart from enabling the previously described features, this template group automatically configures sending alert mails every time an error is encountered while running a project. This can be a very important asset when executing a project takes a long time and the execution cannot be manually monitored. It is critical to be able to fix the errors as soon as possible and continue the execution, which can be accomplished with the automatic setup of alert emails.
     

  4. Framework Logging with Alerts and Informational Emails - apart from enabling all previously described features, this template group automatically configures sending informational emails for each project execution. Informational emails have a more user-friendly interface than dashboard logs, especially if a project contains many packages and there is a lot of data.

 
Connections

 

Creating a new connection means defining all the necessary information needed for data warehouse integration. It connects the user to a location where data is stored or to which the data needs to be extracted. Depending on the project pattern, a package may require one, two or even more predefined connections (except for the Execute Process Pattern, for which no connection is required, due to the fact that it is process, not data-oriented).

Connections are the key part of eltSnap, because no data integration process can take step if the data cannot be successfully retrieved. Connections are sensitive, due to the fact that even the slightest change in one of its parameter values results in being unable to connect to the data source. This is why configuring connections should be done with utmost care.

When a new connection is configured, parameters for the most important fields (e.g. server name, database name) are automatically created and can be used in packages. Connections are also linked to projects, meaning that an existing connection cannot be used in a package if it has not been added to the project of the package. All eltSnap objects can easily be added to projects directly from the dashboard.

eltSnap offers different kinds of connections, which are used for different parts of the data integration process. The data is usually stored in databases, but it can also be exported and saved in a file, so connections of this type are also available. Moreover, connections to mail servers also need to be established for some eltSnap features. Depending on the type of the connection, different information needs to be entered, and different possibilities for usage in packages exist (i.e. the Execute SQL Pattern allows using only connections to databases).

 
An overview of all available connection types is provided in the following segments.

 

OleDb Connections

 

This connection type is chosen when the user wants to connect to any type of data storage (as opposed to connecting to a flat file or an email server), which can be accessed by using a connection string. OleDb connections are the most commonly used in eltSnap, considering that the data which needs to be used in the data integration process is usually located in a database on a database server. Due to the fact that most database servers offer the possibility to connect via a connection string, be it a local or cloud-based storage, this makes the possibilities of eltSnap connections limitless.

 

In order to create a new OleDb connection, it is necessary to choose the Connection main menu option, then choose OleDb Connection, and then choose the Add New option.

 

The following data needs to be provided:

  • Connection Name - the name by which the connection will be identified in eltSnap. This name does not have to be the same as the name of the database, and can be manually chosen by the user, but it is recommended to use the database name in cases where it is easy to memorize.
     

  • Server Name – the name of the database server where the desired data is located. This field is case-sensitive, so the user needs to make sure that the server name they enter is identical to the name of their chosen database server.
     

  • Database Name – the name of the database where the desired data is located. This field is case-sensitive, and the user needs to make sure that the database with the name they enter exists on the database server they previously referenced.
     

  • Provider Name - the identifier of the provider for the database server where the desired data is located. The default value, SQLNCLI11, is the provider name for Microsoft SQL Server, the most commonly used database server in eltSnap.
     

  • Custom Connection String – the complete connection string for the database where the desired data is located. Using this field overrides using all other fields, due to the fact that other fields have the purpose of forming the connection string, which is in this case provided in full and will be used as such. Custom connection strings are usually specified in situations where the previously mentioned fields do not suffice and other fields need to be added to the connection string.

The Connection Expression field is not editable, and contains the full connection string of the connection (which is computed after the connection is created). If the connection cannot be successfully established, the user can compare the value of this field with the connection string provided by their database server to check for configuration mistakes.

SMTP Connections

 

This connection type is used for connecting to email servers, which is usually necessary for the Framework Logging with Alerts and Informational Emails template group usage. After successfully establishing a connection to an email server, emails about the project execution can be sent to the user. SMTP connenctions can also be manually created.

 

In order to create a new SMTP connection, it is necessary to choose the Connection main menu option, then choose SMTP Connection, and then choose the Add New option.

 

The following data needs to be provided:

  • Connection Name - the name by which the connection will be identified in eltSnap. The name can be manually chosen by the user, although in some cases where the server name is easy to remember, the server name can be used as the connection name.
     

  • Server Name - the name of the email server to which the user wishes to connect. This field is case-sensitive, so the user needs to make sure that the server name they enter is identical to the name of their chosen email server.
     

  • SMTP Username - the username credentials for connecting to the email server. This field is optional, due to the fact that not all email servers require providing credentials.
     

  • SMTP Password - the password credentials for connecting to the email server. This field is optional, due to the fact that not all email servers require providing credentials.

The Server Name Expression field is not editable, and contains the parameter reference of the Server Name field (which is computed after the connection is created).

Flat File Connections

 

This connection type allows the user to connect to a data source defined in a custom (so-called flat) file. Flat files are usually the product of exporting data from a database. JSON files are most commonly used type of file formats, although the markup can be manually configured and the data exported to any user-chosen (or manually created) file format.

 

In order to create a new Flat File connection, it is necessary to choose the Connection main menu option, then choose Flat File Connection, and then choose the Add New option.

 

The following data needs to be provided:

  • Connection Name - the name by which the connection will be identified in eltSnap. The name can be manually chosen by the user.
     

  • File Path - the full path to the location of the desired flat file on the operating system being run.
     

  • File Format - the custom format of the file where the data is located. Only previously defined flat file formats can be used, since configuring a flat file format requires unique information. Flat file formats are explained in detail in the next section.

 

The File Path Expression and File Format Expressions fields are not editable, and contain the parameter reference of the File Path and File Format fields respectfully (which are computed after the connection is created).

Flat File Formats

 

Flat file formats define the way in which the data is stored in custom flat files. Due to the fact that it is easier to transport and duplicate data when it is exported to a flat file than by performing multiple operations directly on the databases, it is important to be able to read and distinguish the data defined in the file. In order for eltSnap to be able to import (and export) the data from these files, specifying all details about how the data is marked in the file is necessary.

In order to be able to work correctly, the flat file connection needs to have a valid flat file format, so its correct configuration is very important and should be done with caution.

 

In order to create a new Flat File format, it is necessary to choose the Connection main menu option, then choose Flat File Connection, and then choose the Flat File Format option. The list of all existing flat formats will be shown on the screen, after which it is necessary to click the Add button.

 

The following data needs to be provided:

  • File Format – the name of the user-defined file format. This value is manually chosen by the user, and appears in the list of available flat file formats when creating a new flat file connection.
     

  • Code Page – the encoding of characters used in this format. Has a default value of 1252 (single-byte Latin characters).
     

  • Column Delimiter – the character used for separating database table columns. Has a default value of the most-commonly-used comma (,) character.
     

  • Row Delimiter – the character used for separating database table rows. Has a default value of the most-commonly-used CRLF (positioning the cursor to the beginning of the new line) character.
     

  • Text Qualifier – the character used for encompassing text values of database cells. Has a default value of the most-commonly-used _x0022_ character (used in Microsoft SQL Server SSIS packages).
     

  • Metadata Server – the name of the database server on which the metadata for the data defined in the flat files with this format is stored. This field is case-sensitive.
     

  • Metadata Database – the name of the database in which the metadata for the data defined in the flat files with this format is stored. This field is case-sensitive.
     

  • Metadata Schema – the name of the database schema in which the metadata for the data defined in the flat files with this format is stored. This field is case-sensitive.
     

  • Metadata Table – the name of the database table in which the metadata for the data defined in the flat files with this format is stored. This field is case-sensitive.

Parameters

 

Parameters are eltSnap objects which encompass a value and allow it to be reused in multiple packages. This means that the control of the parameter value is centralized, allowing for automatic propagation of changes to all packages where the parameter is used. This can be especially useful for values that are prone to changes, such as user passwords, allowing easy and quick control for the user.

 

Like all other objects, parameters are automatically added to the currently active project and can be reused by being added to other projects as well, or cloned and modified to match the new configuration.

In order to create a new parameter, it is necessary to choose the Parameter main menu option and then choose the Add New option.

 

The following data needs to be provided:

  1. Parameter Name - the name by which the parameter will be identified in eltSnap. The name can be manually chosen by the user.
     

  2. Parameter Type - the type of parameter value. The following types are supported: string, directory, username, password, HTTP address, file, and determine how the value will be handled when resolving the parameter value, as well as allowing filtering based on parameter type for different purposes.
     

  3. Parameter Value - the value of the parameter, which is the value to which the parameter reference will resolve when executing packages.

 

The Parameter Reference field is not editable, and contain the value of the reference for the parameter (the value which will be shown in the configurations when the parameter is used in packages). It is automatically computed after the parameter is created.
It is also possible to use Azure Key Vault secrets as parameters. After checking the "Value is stored in Azure Key Vault", the user will be taken to the authentication page for their Azure account. It is important to have all prerequisites pre-installed in order for the feature to work correctly. The user can follow the instructions they can find by clicking on the ? (Help) button on the Add New Parameter form, where all relevant information about the prerequisites are located.
Adding Azure Key Vault secrets as parameters allows the user to retain the security of sensitive information, due to the fact that the values of Azure Key Vault parameters are not saved in the configuration database, but are resolved every time the parameter value is needed. This means that the sensitive data of the user is not exposed at any point.
 
Packages

 

Packages are the central part of eltSnap, as they allow the user to define the most important thing - how exactly the data integration process will be executed. Creating a package means defining the configuration of the desired data integration step. Multiple packages of different types can define different steps required for a single data integration process, which helps granulate the complicated data integration process and therefore make it easier to configure, understand and handle its execution. 

 

It is usually very important to define the exact order in which packages should be executed. The order of execution is defined by sequence numbers, which are automatically assigned to all packages upon their creating and initialized to the value of zero (0). This means that, by default, all packages should be run in parallel. Sequence numbers are shown before the name of exach package on the dashboard, and can be edited by clicking on the Sequence Number button. Package execution begins from the lowest sequence number. Two packages with the same number will be executed in parallel.

 

Creating a package is an easy operation, due to the fact that many configuration details are automatically handled, allowing the user to concentrate on the main purpose of their package and the code that drives the process itself, instead of getting lost in the details and having a hard time with getting their package to properly execute without errors.

 

In order to make the package definition process easier, packages are divided into a set of predefined package patterns instead of using just one generic package template for all package types. All of these patterns are designed for different purposes and different parts of the data integration process. These patterns will be explained in the following paragraphs.

 

Data Flow Package Pattern

This package pattern is intended for the Extract and Load parts of the ELT paradigm. It allows the user to move data from one database to another without making any changes to the data during the process. When put in the context of data integration, this means that data can be extracted from the source database and loaded to the destination database, where it can later be transformed into a form suitable for the desired purpose.

The defined data flow package serves as a bridge between the source and destination connections. It is automatically added to the currently active project after it is created, and can be reused either by being added to another project or by being cloned and modified to match the new conditions.

In order to create a new parameter, it is necessary to choose the Package main menu option and then choose the Add Data Flow Package option.

 

The following data needs to be provided:

  1. Package Qualifier - the name of the package. This value is manually chosen by the user, and usually describes the main purpose of the package.
     

  2. Source Connection - the connection where the source data is located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  3. Source Query - the SQL query for selecting only the data from the source database which needs to be transferred to the destination. Allows the user to limit the scope of the source data without having to manually export the data to a new source database. It can include parameter values, in which case it is necessary to check the box for treating the query as an expression. The user can then choose to Evaluate the expression and check to which value it will be resolved when being executed.
     

  4. Destination Connection - the connection where the destination data needs to be located.It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  5. Database Destination - the definition of the schema and table to which the source data needs to be transferred. This information needs to be manually defined, unlike the source schema and table, which are determined through the source query.

The Package Name field is automatically computed and represents a combination of all important fields of the package. This helps make a diversity of different package names which can easily be identified without the necessity to use complicated naming systems.

Other options, such as copying batches of rows in parallel (Use Bulk Copy), deleting all data from the destination before inserting the new data (Truncate Destination) or performing UPDATE alongside INSERT statements (Use Identity Insert on Destination) exist, allowing the user to specify more details about their desired operation.

Execute SQL Package Pattern

This package pattern is intended for the Transform part of the ELT paradigm. It allows the user to alter the desired data located in the source database. When put in the context of data integration, this means that merge and partition statements can be created for Dimension and Fact tables. Most of the configuration is setup automatically, so that the user does not have to manually create all required statements, instead only configuring what should be merged/partitioned and in which way. This can save a lot of valuable time, as well as make the process less tiring and open to beginners and inexperienced users.

The defined execute SQL package serves as tool for modifying the existing data in the desired way. It is automatically added to the currently active project after it is created, and can be reused either by being added to another project or by being cloned and modified to match the new conditions.

In order to create a new parameter, it is necessary to choose the Package main menu option and then choose the Add Execute SQL Package option.

 

The following data needs to be provided:

  1. Package Qualifier - the name of the package. This value is manually chosen by the user, and usually describes the main purpose of the package.
     

  2. Source Connection - the connection where the source data is located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  3. Query - the SQL query for performing the desired transformation on the desired data. The desired transformation can be a set of SQL statements or stored procedures - any valid SQL code is valid. It can include parameter values, in which case it is necessary to check the box for treating the query as an expression. The user can then choose to Evaluate the expression and check to which value it will be resolved when being executed.
     

  4. Query Configuration - a non-editable field which indicates whether a predefined query configuration has been used for the selected package. Using a predefined query configuration means configuring a code generator that automatically fills the query field. This feature will be explained in detail in the next section.

The Package Name field is automatically computed and represents a combination of all important fields of the package. This helps make a diversity of different package names which can easily be identified without the necessity to use complicated naming systems.

It is also possible to allow showing the information on the number of data rows which have been transformed upon package execution by checking the Return Row Counts option.

Query Configuration

Query configuration allows the user to use an automatic SQL code generator for defining the transformation query of the Execute SQL package pattern. The code generators create prepared Dim and Fact table MERGE and PARTITION statements that would otherwise be very complex to write manually and would be prone to errors.

After a query configuration is setup, it can always be reviewed and changed by clicking on the Open Query Configuration button. The query field with automatically created statements will be non-editable, although the user can always copy the statements into a new package and modify them there if the need for it may arise.

The following code generators can be used:

  1. Query Config - Dim Table Merge (Standard):

    The purpose of this configurationis to transform the source data so that it represents the contents of a Dimension table. In order for the query to be successfully configured, dimension types need to be specified for all attributes by using the SnapMart eltSnap application for handling metadata. The purpose of the MERGE statement is inserting and updating data based on the contents of the associated staging table. It conforms to the Kimball Type 1 and Type 2 dimension models.

    The following data needs to be provided:

     

    - Basic information - includes specifying on which server and in which database the merge source and destination tables are located.

    - Dimension Column Naming Scheme - includes specifying which scheme for naming the Dimension table columns should be used. The scheme can be manually defined by the user (or the Standard predefined scheme can be selected).

    Merge Source (Staging Table) - includes specifying where the staging table is located.

    Merge Destination (Dimension Table) - includes specifying where the dimension table is located.

  2. Query Config - Fact Table Merge (Basic):

    The purpose of this configuration is to transform the source data so that it represents the contents of a 
    Fact table. In order for the query to be successfully configured, the metadata needs to be specified by using the SnapMart eltSnap application for handling metadata. The purpose of the MERGE statement is inserting and updating data based on the contents of the associated staging table. It conforms to the Kimball Type 1 and Type 2 dimension models.

    The following data needs to be provided:

    -
    Basic information - includes specifying on which server the merge source and destination tables are located.

    Merge Source (Staging Table) - includes specifying where the staging table is located.

    Merge Destination (FactTable) - includes specifying where the fact table is located.
     

  3. Query Config - Fact Table Partition (Standard):

    The purpose of this configuration is transforming one Fact table with a very large number of rows into several partitions, which can be queried faster and handled more easily, both when it comes to execution time and memory usage. This requires specifying the partition function and scheme, which need to be manually created by the user and which define the criteria of the partitioning. It is important for the partitioning column of the Fact table to be defined as the DATE type of data, and for the table indexes to be partitioned aligned. Otherwise, it will not be possible to execute the partitioning on the desired Fact table.

    The following data needs to be provided:

    Merge Source (Staging Table) - includes specifying where the staging table is located.

    Date-Based Partitioning Scheme - includes specifying the criteria for dividing the data into different partition segments.

    It is possible to delete all previously existing data from the switch-in table by checking the Truncate Switch-In Table on Load option.

    More information about this process can be found at: https://docs.microsoft.com/en-us/sql/relational-databases/partitions/partitioned-tables-and-indexes?view=sql-server-2017.

     

  4. Query Config - Fact Table Switch (Standard)​:

    The purpose of this configuration is switching Fact table partitions, which means moving partitions between Fact tables very quickly, hence reducing the overall time necessary for completing this process. It is important to have matching values for the source and destination partitions (identical columns, indexes, partition columns, schemes and functions, same file group), and for the switch-to destination partition to be empty prior to the execution of these statements.

    The following data needs to be provided:

    Partitioned Source (Switch-In) Table - includes specifying where the source partitioned Fact table is located.

    Partitioned Destination (Switch-To) Table -includes specifying where the destination partitioned
    Fact table for transferring the partitioned data is located.

    Partitioned Removal (Switch-Out) Table - includes specifying where the destination partitioned
    Fact table for transferring the removed data is located. The criteria (which is date-based) for the data removal also needs to be specified (predefined parameters such as today and yesterday can be used, as well as user-defined start_date or some other project-level parameter).

 

Foreach Data Flow Package Pattern

 

This package pattern is similar to the Data Flow package pattern, with one important difference: it is not limited to just one source database. This way, the user can connect to the place which defines which desired tables should be accessed, and when the package is executed, it will loop through all the source tables, extract them and load them to the destination. This means that the user would not have to define separate Data Flow packages for performing the desired data transferring, therefore making this process both easier to configure and execute.

In order to create a new parameter, it is necessary to choose the Package main menu option and then choose the Add Foreach Data Flow Package option.

The following data needs to be provided:

  1. Package Qualifier - the name of the package. This value is manually chosen by the user, and usually describes the main purpose of the package.
     

  2. For Each Connection - the connection where the definitions of different source tables are located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  3. For Each Query Expression - the SQL query for selecting only the source tables which need to act as the source. Allows the user to limit the scope of the source database tables. It is automatically treated as an expression, and clicking on the Evaluate Expression allows the user to check the value the query resolves to when being executed.
     

  4. Source Connection - the connection where the source data is located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  5. Source Query Expression - the SQL query for selecting only the data from the source database which needs to be transferred to the destination. Allows the user to limit the scope of the source data without having to manually export the data to a new source database. It is automatically treated as an expression, and clicking on the Evaluate Expression allows the user to check the value the query resolves to when being executed.
     

  6. Destination Connection - the connection where the destination data needs to be located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  7. Destination Schema and Table - the definition of the schema and table to which the source data needs to be transferred.

The Package Name field is automatically computed and represents a combination of all important fields of the package. This helps make a diversity of different package names which can easily be identified without the necessity to use complicated naming systems.

Other options, such as copying batches of rows in parallel (Use Bulk Copy), deleting all data from the destination before inserting the new data (Truncate Destination) or performing UPDATE alongside INSERT statements (Use Identity Insert on Destination) exist, allowing the user to specify more details about their desired operation.

Foreach Execute SQL Package Pattern

 

This package pattern is similar to the Execute SQL package pattern, with one important difference: it is not limited to just one source database. This way, the user can connect to the place which defines which desired tables should be transformed, and when the package is executed, it will loop through all the source tables and perform the desired transformation This means that the user would not have to define separate Execute SQL packages for performing the desired data transformation, therefore making this process both easier to configure and execute.

In order to create a new parameter, it is necessary to choose the Package main menu option and then choose the Add Foreach Execute SQL Package option.

The following data needs to be provided:

  1. Package Qualifier - the name of the package. This value is manually chosen by the user, and usually describes the main purpose of the package.
     

  2. For Each Connection - the connection where the definitions of different source tables are located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  3. For Each Query Expression - the SQL query for selecting only the source tables which need to act as the source. Allows the user to limit the scope of the source database tables. It is automatically treated as an expression, and clicking on the Evaluate Expression allows the user to check the value the query resolves to when being executed.
     

  4. Source Connection - the connection where the source data is located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  5. Query Expression - the SQL query for performing the desired transformation on the desired data. The desired transformation can be a set of SQL statements or stored procedures - any valid SQL code is valid. It is automatically treated as an expression, and clicking on the Evaluate Expression allows the user to check the value the query resolves to when being executed.

The Package Name field is automatically computed and represents a combination of all important fields of the package. This helps make a diversity of different package names which can easily be identified without the necessity to use complicated naming systems.

It is also possible to allow showing the information on the number of data rows which have been transformed upon package execution by checking the Return Row Counts option.

Execute Process Package Pattern

 

This package pattern is intended for performing data integration operations that require invoking external processes. The processes which are most commonly invoked are cmdpowershell and python. Due to the fact that any process is invokable through the Command Prompt, this makes the possibilities of this package pattern limitless. Therefore, any desired external operation can be performed by using this package pattern, by simply defining which process needs to be invoked, in which directory, and with which arguments.

In order to create a new parameter, it is necessary to choose the Package main menu option and then choose the Add Execute Process Package option.

The following data needs to be provided:

 

  1. Package Qualifier - the name of the package. This value is manually chosen by the user, and usually describes the main purpose of the package.
     

  2. Process - the name of the process which needs to be invoked. All processes are invoked by using the Command Prompt CLI.
     

  3. Arguments - the arguments that will be passed to the process when invoking it. The process and arguments expressions will be concatenated so that they represent a single command invoked through the CLI.
     

  4. Working Directory - the directory in which the process will be invoked. Only manual input and working directory type parameters are supported.

The Package Name field is automatically computed and represents a combination of all important fields of the package. This helps make a diversity of different package names which can easily be identified without the necessity to use complicated naming systems.

It is also possible to add a Query Configuration, if the process is aimed at data transformation, Store Runtime Values in ELT_Framework Database, or manipulate Variables tied to the specified package.

JSON Table Transfer Package Pattern

 

This package pattern is aimed at transferring data between databases with flat files in the middle or from flat files to databases and vice versa. Due to the fact that database data can be exported to flat files, this package pattern adds support for handling the link between exported data and the database that serves as a source/destination. The flat files need to have the JSON extension, which is the most commonly used for exporting and importing data. Three different scenarios are supported:
 

  • Transferring tables from a database source connection to a flat file destination connection;
     

  • Transferring tables from a flat file source connection to a database destination connection;
     

  • Transferring tables from a database source connection to a database destination connection with exporting the data to a flat file connection during the process.

 

In order to create a new parameter, it is necessary to choose the Package main menu option and then choose the Add JSON Table Transfer Package option.

The following data needs to be provided:

 

  1. Package Qualifier - the name of the package. This value is manually chosen by the user, and usually describes the main purpose of the package.
     

  2. Source Connection - the connection where the source data is located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  3. Table Select Options - allows the user to manually choose whether they want to include all tables from the source connection or not. Due to the fact that selection of chosen tables needs to be manually made, it is recommended to use the Except option in case when only a small number of tables will be excluded from the export.
     

  4. Table List - allows the user to choose which database tables they wish to include in the export.
     

  5. Flat File Connection - the connection where the source/destination flat file is located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.
     

  6. Destination Connection - the connection where the destination data needs to be located. It is recommended to pre-configure the connection and choose it from the drop-down list, although it can be added directly from the package form.

The Package Name field is automatically computed and represents a combination of all important fields of the package. This helps make a diversity of different package names which can easily be identified without the necessity to use complicated naming systems.

 
Environments

 

Environments are eltSnap objects which allow the user to define different conditions under which their projects will be executed. This includes defining different template groups with different parameter values for all different project execution conditions. Environments are usually used for defining the devtest and prod execution conditions, as they are the most popular and common project stages which require different conditions for project execution.

Different environments can use the same template group, but different values for template group parameters (i.e. the user can define receiving alert and informational emails for two environments, but specify two different email addresses to which the emails should be sent). This makes it possible to execute one project under different conditions, without having to clone the project or redefine it with different parameters.

 
Execution

 

After configuring all eltSnap objects, placing them in projects and defining the sequence numbers of the packages, the projects can be executed, allowing the defined data integration steps to take place. Packages can also be executed independently of the projects they belong to, depending on what purpose they have and what the user wishes to do.

There are two options for execution:

  • Execution directly through the CLI;
     

  • Execution through the eltSnap Desktop application.

If the user wishes to execute their project/package without using the desktop application, they need to open the Command Prompt, change the active directory to the directory in which eltsnap runtime is installed, and enter the command which invokes the execution. The command for the specific project or package can be found by clicking on the Runtime button.

Control Panel

It is important to be able to monitor the execution after it is started, due to the time sensitivity of the processes and the need to fix any errors as soon as possible. Informational messages about the execution can be seen in the CLI (if the execution is invoked through the CLI) or in the Progress Log (if the execution is invoked through the desktop application). However, these messages are only visible after the execution is finished, which might not be convenient for projects which take a long time to finish executing.

This problem is solved by using the Control Panel eltSnap application. This application enables the user to monitor the execution in real-time, being informed about how many  packages have finished executing, what packages is currently being executed, being able to abort the execution and much more. The Control Panel can be used independently of eltSnap Desktop, meaning that monitoring of CLI-invoked executions is also supported.

When the Control Panel is first opened, the list of the projects currently being executed is empty. The list is updated every time the Refresh button is clicked. It is not automatically updated due to the fact that the time it takes to execute a package usually does not require automatic refreshing every couple of seconds, which would just unnecessarily put a strain on the usage of resources of the user.

The most important information shown in the Control Panel includes:

  • Concurrently Running Projects - the number of projects for which the execution has been invoked, and which are being run at the same time.
     

  • Queued Projects - the number of projects waiting for execution start. A project is queued when its execution is invoked although it is currently being run, which initiates the reset event (the execution is stopped and then initiated again).
     

  • Packages Running - the number of packages which are currently being run. Indicates whether multiple packages are being run in parallel.
     

  • Packages Waiting - the number of packages for which the execution has not been invoked yet, due to their sequence numbers in the currently active project.
     

  • Packages Completed - the number of packages for which the execution has been successfully completed.
     

  • Estimated Completion - the estimated time left before the project execution is completed. The value of this field empty when the project is run for the first time, and is calculated for every next execution based on the average execution time of previous project executions.
     

  • Total Rows Affected - the cumulative number of rows affected for all packages contained in the project. Is updated only for packages which enable this feature.
     

  • Status - indicates whether the project is waiting in queue, currently running or if its execution has stopped (successfully or due to an error). The user can force stop the execution of a project which is currently being run by clicking on the Stop button.

 
Change Connection

 

The initially configured connection to the eltSnap databases can be changed at any moment. This will result in the update of eltSnap configuration files. It is also necessary to run the eltsnap database scripts on the new target database, in order for eltSnap applications to be able to successfully retrieve and update the configuration information.

In order to change the currently active connection to the eltSnap configuration database, the user needs to click on the Change Connection button located at the right lower corner of the dashboard. The currently active connection is always shown at the bottom part of the dashboard.

The following data needs to be provided:

  1. Server Details - allows the user to specify the name of the target Microsoft SQL Server database server. The list of supported eltsnap databases is updated every time new server information is entered.
     

  2. Authentication - allows the user to specify the authentication method used to getting the credentials necessary for connecting to the desired server. This field should match the value of the field of specifying the connection through the SSMS.
     

  3. SQL Server Credentials - allows the user to specify the username and password necessary for connecting to the specified server.

2019 BI Tracks, LLC

  • Facebook - White Circle
  • Twitter - White Circle
  • LinkedIn - White Circle