Loginom 6.4 - What's new?

Major update of the Loginom analytical platform: monitoring of users and packages was implemented, Python support was added, existing algorithms were optimized, and a new handler and visualizer appeared while the existing ones were upgraded.

The updated version introduces many changes and new options. In our opinion, the most important are listed below:

  1. Python is supported.
  2. Generation of the output data set from the code in the JavaScript component.
  3. Export to the Data Warehouse component and inheritance mechanism for Import from Data Warehouse component.
  4. Duplicates Detection Handler.
  5. Setting weights and model calibration of Logistic Regression.
  6. Monitoring of active users and busy packages.

Programming languages support

Python

In Loginom 6.4 it became possible to embed calculations written in the Python programming language into the workflow. CPython is used as the runtime environment, and the Python component behaves in the workflow identically to the JavaScript component. Conversion of data sets to the pandas DataFrame structure and back is supported. Popular Python data analysis libraries such as NumPy, pandas, and scikit-learn can be imported.
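
As an illustration of the kind of calculation such a node might embed, here is a plain pandas sketch; the column names and data are invented for the example, and Loginom's actual interface for passing data sets into the node is not shown:

```python
import pandas as pd

def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Example transformation a Python node might apply to an input
    data set after conversion to a pandas DataFrame. The columns
    "region" and "amount" are illustrative, not a Loginom convention."""
    return (
        df.groupby("region", as_index=False)["amount"]
          .sum()
          .rename(columns={"amount": "total_amount"})
    )

# Illustrative input resembling a converted Loginom data set
df = pd.DataFrame({
    "region": ["North", "South", "North"],
    "amount": [100.0, 50.0, 25.0],
})
result = transform(df)
```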

Use of Python-based scripts

Loginom provides no development tools comparable to PyCharm or other specialized products. The main Python use case is to embed Python calculations into an existing data analysis workflow.

Because the component executes Python code on the server, several restrictions apply to its use:

  1. It is not allowed to run commands that work with a GUI (graphical user interface).
  2. The code called in the Python component is executed in a single thread, and only one Python node can be executed at a time. A Python node activated while another Python node is executing will wait for the first node to finish, even if it does not depend on it.
  3. Modules of the multiprocessing package, which parallelize work by creating several processes, and third-party packages that depend on them are not supported.
  4. Critical errors in the Python code can crash the application, so it is recommended to use Python code that has been debugged in advance.
  5. When a data set is converted into a pandas DataFrame, the whole processed data set is materialized, which can sometimes slow down workflow operation considerably.

There are also security considerations. Unlike JavaScript, Python execution is not restricted by the runtime environment: the executed code has full access to the operating system and network protocols. For this reason, the administrator must explicitly enable the Python component on the server. All code is executed on the server with the rights of the operating system user under whose account Loginom Server is started.

Activation of the Python execution feature

For the Python component to work, Python 3.4 or higher, along with all packages to be used, must be installed on the server. For more details, contact the Technical Support Service.

JavaScript

The capabilities of the JavaScript component were significantly expanded, and several previously occurring errors were corrected.

It became possible to set the output data set structure from code. This behaviour may be required when the data structure is not known in advance and is generated while the node executes, for example, when parsing a web service response. The following methods can be used to generate the set of output columns:

  • AssignColumns — creates the output data set columns from a collection of column names/descriptions.
  • AddColumn — adds a column to the end of the output data set column list.
  • InsertColumn — inserts a column into the output data set at the given index.
  • DeleteColumn — deletes a column by name or index.
  • ClearColumns — clears the list of columns.

It is possible to set the output fields structure directly from the code

Errors connected with incorrect row indexing were corrected. They could cause the error "Row number 0 is out of [0; -1] range" while executing JavaScript code, or incorrect display of a data set in the preview window.

Data Warehouse

When importing from the Data Warehouse, data can now be displayed in a hierarchical (tree-like) structure, which can considerably simplify the import configuration when the warehouse has a complex structure.

Tree-Like Structure when Importing from the Data Warehouse

An inheritance mechanism was added for Import from the Data Warehouse. It is useful when connections to the same Data Warehouse with different configuration parameters are used in several places. In the connection wizards for Data Warehouses on MS SQL and Oracle DBMS, it became possible to specify the name of the schema in which the metadata and DW data are located.

Export to the Data Warehouse also became possible. Loginom now supports the whole cycle of ETL operations involving Data Warehouses.

Configuring export to the Data Warehouse

Duplicates Detection

This component can significantly simplify the first stages of an analyst's data quality work. To operate, it only requires the sets of input and output fields to be specified, after which the component automatically detects exact duplicates and contradictions in the data set. Duplicated rows are rows with an exact match of the compared fields.
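
In pandas terms, the idea can be sketched as follows; this is a generic illustration of duplicate and contradiction detection, not Loginom's implementation, and the data and field roles are invented:

```python
import pandas as pd

# Illustrative data: "phone" plays the role of a compared input field,
# "name" the role of an output field
df = pd.DataFrame({
    "phone": ["111", "111", "222", "222", "333"],
    "name":  ["Ann", "Ann", "Bob", "Rob", "Eve"],
})

# Full duplicates: rows with an exact match of all compared fields
duplicates = df[df.duplicated(subset=["phone", "name"], keep=False)]

# Contradictions: identical input field values but differing output values
names_per_phone = df.groupby("phone")["name"].nunique()
contradictory_keys = names_per_phone[names_per_phone > 1].index
contradictions = df[df["phone"].isin(contradictory_keys)]
```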

Duplicates and Contradictions Visualizer

Logistic Regression

Loginom 6.4 introduces two important changes that expand the possible uses of Logistic Regression and, in some cases, improve the quality of the resulting models.

  1. Setting weights for individual records used in model training yields a more accurate model when the significance of individual records in the data set must be taken into account. For example, this approach can be used to train a credit risk assessment model with the loan amount set as the record weight.
  2. Model calibration using a priori probability correction makes it possible to remove distortions that can arise at the training stage when the classes are balanced. The events to non-events ratio can be set automatically from the training or test set data, manually in the wizard, or interactively in the Binary classification assessment visualizer.
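
Record weighting can be sketched with scikit-learn's sample_weight parameter; this is an illustrative analogue, not Loginom's internal implementation, and the risk factors, default flag, and loan amounts below are synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))             # two illustrative risk factors
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic default flag
loan_amount = rng.uniform(1_000, 50_000, size=200)  # record weights

# Weighted fit: records with larger loans influence the model more
weighted = LogisticRegression().fit(X, y, sample_weight=loan_amount)

# Unweighted fit for comparison
plain = LogisticRegression().fit(X, y)
```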

Setting record weights when configuring Logistic Regression

Configuration of events to non-events ratio correction

REST request

Support for the PUT, DELETE, and PATCH methods was added to the REST request connection wizards; previously, only POST and GET were available. These methods may be required to work with some web services.

Previously, the HTTP status of the response received from the server was not taken into account, and its value was not returned on any output port. A column containing the web server response code was added to the "Additional data" output data set. The column holds integer values, is named HttpStatusCode, and has the caption "HTTP Status Code".
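
The newly supported verbs can be illustrated with Python's standard library; this is a generic sketch of issuing such requests, unrelated to Loginom's internal HTTP client, and the URL is a placeholder:

```python
import urllib.request

# urllib.request.Request accepts an explicit HTTP method,
# covering the verbs now supported by the REST request wizard.
url = "https://example.com/api/item/1"  # placeholder endpoint

patch_req = urllib.request.Request(
    url,
    data=b'{"name": "new"}',
    headers={"Content-Type": "application/json"},
    method="PATCH",
)
delete_req = urllib.request.Request(url, method="DELETE")

# The requests are only constructed here, not sent; a real call would use
# urllib.request.urlopen(...) and read response.status — the kind of value
# the new HttpStatusCode column exposes.
```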

Loginom Server

In some cases, a workflow needs the identifier of the request received when a workflow node is executed via Loginom Integrator. Previously, this identifier could not be obtained or used. Now the RequestId variable was added to the "Session variables" port of the "Workflow Variables" node. Its value is null by default, or the request identifier when the workflow node is executed via Loginom Integrator.

Administration

During teamwork, or when executing workflows with complex dependencies, the server frequently reported that a required package was already in use, and it was impossible to determine which user or process held it. Besides, it was sometimes necessary to forcibly clear the cache of packages published via Loginom Integrator, for example, when updating several package versions at once.

To tackle these tasks, a "Session Manager" was added to the Administration section; it can be used to view open sessions and packages, monitor them, and control their activity. In this section, it is also possible to forcibly stop workflow execution or reset a user session.

Session Manager Bar of the Administration Section

When debugging, it was sometimes necessary to enable detailed logging of occurring errors. Previously, this required restarting Loginom Server after placing a special file for recording debugging information next to the executable. Now it can be done in the settings of the server and desktop applications: a special "Trace exceptions" parameter was added for this purpose. A server restart is not required, but the parameter takes effect with a slight delay.

Databases Operation

A check of the connection string and required parameters was added to the connection wizard. PostgreSQL 12 support was added.

Package export of strings 9 or more characters long to fields of the BLOB SUB_TYPE TEXT type in a Firebird database was corrected. The error «"value" column relates to the money type, and expression - double precision», which occurred while trying to export real numbers to a money field of a PostgreSQL database, was corrected. The error «"value" column relates to the time with time zone type, and expression - timestamp without time zone», which occurred while trying to export Date/Time values to a time with time zone field of a PostgreSQL database, was also corrected.

Cube

It became possible to display several measures simultaneously in the chart. Besides, the series shown in the chart can be activated and deactivated.

When filtering by unique values in the Cube, the restriction of displaying no more than 10,000 unique items was removed. A similar restriction was removed for the Row filter. Previously, it was rather inconvenient to set filters over many records, for example, SKUs.

The following less significant but still important improvements were also achieved:

  • it became possible to change the color of a series in the chart;
  • a context menu was introduced into the chart;
  • group control of series visibility in the chart was added: "Show all series", "Hide all series", "Invert series".

The error of removing leading zeros from string fields when exporting from the Cube to Excel was corrected. It was particularly inconvenient when the exported data sets contained fields with item numbers starting with zeros.

Coarse Classes

When data is supplied to a previously trained model, it became possible to compare the distribution of actual bin rates with the training ones in the Coarse classes visualizer. This makes it possible to estimate how well the data used for training the model matches the data it is applied to. If a significant inconsistency is detected, the model should be retrained.
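
One common way to quantify such a discrepancy between training and actual bin rates is the population stability index (PSI). The sketch below is a generic illustration of that idea, not necessarily the comparison Loginom's visualizer performs, and the bin rates are invented:

```python
import math

def psi(expected_rates, actual_rates):
    """Population stability index between two bin-rate distributions.
    Each argument lists bin shares that sum to 1."""
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_rates, actual_rates)
    )

training_rates = [0.40, 0.35, 0.25]   # bin shares at training time (invented)
actual_rates   = [0.30, 0.35, 0.35]   # bin shares on new data (invented)

drift = psi(training_rates, actual_rates)
# A common rule of thumb: PSI above roughly 0.25 signals that
# the population has shifted enough to warrant retraining.
```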

Comparison of distribution of actual bin rates with the training ones

The names and descriptions of the variables holding event and class rates were unified.

Batch Processing

To start a node in Batch processing mode, its name must be set, but the node no longer has to be published. Previously, the only way to set a name was to select the "Published" status in the access modifier configuration window, set the node name, and then return the previous status. Now a name can be set for a node with the "Public" access modifier. To reset the node name, it is sufficient to leave the field blank (this is possible only for the "Public" access modifier).

Other Changes

An animated icon showing process status was added for active processes.

Group editing of data formats in a table became possible.

Automated end-to-end testing was added, which significantly enhanced the application's reliability.

Previously, node locking could occur while cutting/pasting nodes, which could cause errors. The locking mechanism became more reliable.

The Correlation Analysis handler now shows the user the lag number (offset value) when calculating the extremum of the cross-correlation function.

Model parameter names were corrected in the Linear and Logistic Regression reports, missing parameters were added to the Linear Regression report, and the names of model selection methods and factors were added.

It became possible to replace null numerical values with 0 in the Imputation handler; the handler's operation was optimized, and the Unspecified option was removed from the list of methods allowed for the Logical data type.

Tooltips displaying the full text when hovering the cursor over a field or field caption were added.

The lower bar with the single "Close" button was removed from the quick view window. Instead, the window can be closed by clicking anywhere outside it or by pressing «×» (the window close button).

"Scripting" group of components was added (includes Python and JavaScript).

Previously detected errors were corrected, the application's operational stability was increased, and memory consumption was optimized.

#loginom#release notes#6.4#6.4.0

See also

Missing data imputation
In practice, missing data are very common in real data processing. The reasons may comprise data entry errors, information hiding, or fraud. In this article, we will discuss in which cases...
Loginom 6.4.2 release notes
In this update, special attention was paid to the platform usability, performance improvements related to complex workflows operation were made, and some fixes were performed.
Loginom 6.4.1 release notes
Several errors detected when testing the features added in version 6.4.0 were corrected in this version. Operation of SOAP services and connections to databases was corrected.