En ba db modernization
From The Real BLU AGE® Wiki
Data and database Modernization Reference Manual version V3.6 September 2011
Introduction
Blu Age Data and dataBase Modernization (DBM) helps you to modernize any type of database. Its main goal is the modernization of the databases of old applications (non-normalized relational databases, file databases,XML databases, …) which naturally implies the re-normalization of schemas and the complete integration of Blu Age products for the modernization of legacy software, both of which contribute to the modernization of both the database and the code that uses it. DBM thereby allows for the generating of scripts that can create the new relational database, of scripts that transfer the data - to be used with Talend Open Studio (www.talend.com), and of the UML2 model that can be exploited thanks to Blu Age V3.6 and beyond. What's more, DBM can also be used to maintain these modernized bases, thanks to the management of different SQL artifacts or the adding of tables, columns, indexes, constraints, etc...
This tools will be particularly useful for :
- Data Base Administrators,
- Functional Domain Experts,
- Developers,
- Software and Information System Architects,
- Modernization experts.
A dedicated training is available. For any detailed information please refer to www.bluage.com. This entire document is protected by French and US laws specially for what is regarding ownership and intellectual properties. All rights of intellectual and industrial property (copyright, patents, trademarks), including in particular the exploitation rights, copyrights and extraction in any medium, of some or all data, files and all Material on the Web pages on this site, and representation rights and copyright in any medium, of all or part of the document itself, the rights of modification, adaptation or translation, are reserved exclusively for Netfective Technology - and its potential third party licensors. In addition, all trademarks and logos represented on this site are property of their respective owners.
Getting Started
Installing
Installing the reverse DBM tool is done by executing the downloaded executable. Required prerequisites: a JDK 1.6 must be installed along with Talend 4.2 (http://www.talend.com).
Launching
The reverse application is launched via dbm.exe or a shortcut to the executable. After the initial loading, you are requested to specify the location of your workspace.
Choose a location for the first launch. Whenever starting the software later, the last workspaces that were used will be suggested as choices in the selection tool.
Main Icons
: Allows for the complete reduction of the tree structure
: Allows one to link the current editor to the tree structure
: Whether to display the categories or not
: Creating a connection profile
: Opening an SQL "scrapbook" (query and execution editor)
: Displaying elements by their "databaseName" (modernized name)
: Displaying elements by their "modelName" (name in the UML2 model)
: Starting the registry of a macro
Choosing a perspective
Once the application is launched, go to : Window > Open Perspective > Other… The choose the "Database Development" perspective.
It is also a good idea to open the "Package Explorer" view so as to make it easier to look at open projects.
(Window > Show View > Other… then Java > Package Explorer)
Initializing the DBM Project
Creating a repository
The first step consists in creating a repository. To that end, click on the
icon of the "Data Source Explorer" view. The following pop-up appears :
There are four ways to create repositories.
Create
"Create a Workspace Based Connection Profile Repository" allows you to create a new repository. One then has to configure the connection not only to the source database but also to the modernized database (see ).
Once it is validated, three connections appear:
- "Connected" corresponds to the connection to the source database that will be used to visualize its structure as well as make queries to its data. (see the SQL Editor).
- "Legacy" is initially identical to "Connected", but this time it becomes the working connection where all the modernizations are going to be done. The connection properties allow a first filter (tables, schemas, etc…) and then allow you to shift into offline mode (right-click > Work Offline). This operation consists in locally recovering the entire model of the database, in light of future modernizations.
- "Modernized" : connection to the modernized database. It allows one to execute the DDL scripts for the creation of artefacts in the new database and to handle requests once the transfer has taken place (see SQL editor).
Executing this wizard will also lead to the creation of an Eclipse project that corresponds to the repository. (the link is made using the name that is common to both artefacts). This project will contain the folders that correspond to the "legacy" connection, the connection information, the resulting artifacts (UML model, audit reports, Talend scripts), the modernization routines being used, the generated SQL scripts, etc…
As far as collaborative work is concerned , this is the project that is sent to the sources manager. Once the project is recovered, the repository can be recovered by using the "Import" wizard.
The Package Explorer view allows one to visualize the different projects whereas the Data Source Explorer view allows one to visualize the different repositories.
The GlobalRefactorings folder will contain all the refactorings of global type that have been created.
The SingleRefactorings folder will contain all the refactorings of unitary type that have been made.
The Stereotypes folder will contain all the stereotypes used in this repository.
Those 3 elements will be further described later in this document.
The Sequences folder (visible or not according to the type of database) contains all the sequences for that schema.
The Stored Procedures folder will contain all the registered procedures.
The Tables folder will contain all the tables for that scheme
Modernize
"Modernize An Existing Workspace Based Connection profile Repository", will allow you to do the modernization work iteratively.
The idea is to cut up the problem into smaller parts so as to be able to handle them better in isolation. Oftentimes, the first modernization will involve renaming and management of types or dates. Following up on that first modernization, we would look at adding new SQL artifacts (foreign keys, index, etc…) to the repository that was made during the modernization of the first modernized base.
In fact, at each new step we are going to modernize the database produced in the previous step. This wizard can just as easily allow you to modernize a database that is under production.
The resulting Legacy connection will therefore correspond to the modernized connection of the repository that is currently being modernized. One still has to Configure the connection to the new modernized database (oftentimes the same database but with a different schema).
A new type of connection appears : "Summary":
In connected mode, this will show the connection to the first iteration repository of the modernization project.
In offline mode, it corresponds to the sum of all the modernization modifications from the first iteration repository to the last modernization step, with the goal of achieving high tracability and to allow the production of scripts for a complete modernization.
Import
A wizard that is mainly used during the recovery of a modernization project from a source manager : the project must already exist in the workspace – select it to start the import. To share a repository from within a source manager, one need only add the project corresponding to the repository (the names are identical) to the source manager.
The connection profile name will automatically be the same as the imported project.
Merge
This wizard allows one to unify two distinct modernization respositories. It can be used when the source database is made up of heterogeneous data support.
Connecting to a relational database
Choose the type of connection from among the available choices. The connection will automatically be named after the repository.
Whenever connecting for the first time to a database of a certain type, it will be necessary to configure the appropriate drivers (base versions, the corresponding JAR files, etc…).
Once the driver has been defined, all that is left is to fill in the connection parameters in the usual way. The Test Connection button then allows you to make sure that no errors have been made.
Connection to a legacy Pacbase database
This type of connection requires the BRM tool (BluAge Reverse Modeling) and a corresponding extraction project.
Choose "Pacbase Connection Profile Wizard", Fig 2.
The first element of the next screen is the path that needs to be set to the "vap.ecore" file that needs to be used. This file is usually found in the BRM project in the reverse > PAFAssociationsPosAbs2VAP folder.
The second element determines whether or not this is a batch project.
All that is left to be done is to create the connection to the modernized base in the usual way.
Nevertheless, it will be necessary to shift the "Legacy" connection into offline mode to start to work on the modernization of data even if the filters do not work for this type of connection.
Filtering of connections
When we establish a connection to a base, all of the schemas and tables are recovered. If you only want to keep certain elements : use the properties of the connection to parameterize the filter for those schemas or tables. It is important to go through this step before shifting the "Legacy" connection to off-line mode, so as to be left with only the necessary elements. This step is useless in the case of PACBASE connections. Indeed, for connections of type PACBASE, unnecessary elements are directly deleted
These filters can also be defined for "Connected" and "Modernized" connections if need be.
"Connected" and "Modernized" Connections
The "Connected" and "Modernized" connections allow one to access the content of these schemas (sequences, tables, views…). The main purpose of these connections is to be able to visualize the structure of the base as it first comes in and the base we get at the end and to be able to examine its data.
From the tree structure, it then becomes possible to visualize and edit the data from a table (Data > Edit), load data from a file (Data > Load…), extract the data into a file (Data > Extract).
"Legacy" Connection
The "Legacy" connection, once it is in offline mode ("Work Offline"), allows additional elements to appear in the schemas, most significant of which are the rehandling (or refactoring) rules and the stereotypes. The contextual menu (right-click on an element) allows different actions depending on the selected element. It is possible to create an element (table, column, constraint, stereotype, …), to launch refactorings (refactoring > …), to generate the script for the creation of a database (Generate DDL), to generate Talend scripts (Transform to Talend) or the analysis files for the base (Generate Audit Reports).
To make things easier, it is possible to visualize the elements with their modernized names instead of their original names, either using the future base name (databaseName) or in the UML model used by Blu Age (modelName), accessing them directly in the tree structure. One need only click on the
icon to see the databaseName or
to see the modelName.
Data Base Audit
Generating the analysis
DBM allows you to analyze all the elements of the database based on a schema in the "Legacy" connection. This functionality is especially useful when measuring the extent of the changes that need to be made to the initial base, at the start of the modernization. It can also be used at the end of the modernization to check that there are no more problems left to address.
The result of the analysis presents itself in the form of Excel files. Each Excel file is linked to an audit rule. These are logged in the project that corresponds to the repository, as a function of the strictness of the rule. They can therefore be seen from the Package Explorer view.
The audit rules
Error
- "PK_With_Nullable_Columns" locates all the primary keys for which at least one of the columns that make it up accepts the null value, which is not coherent for a primary key.
- "Repeating_Group_With_Different_Type_Definition" classifies the columns of a single table that have similar names, as in a kind of list (CLIENT1, CLIENT2, CLIENT3,…) but of which the types may be different. While this may be an error or be intentional, it will become necessary to homogenize these types to be able to modernize the data together.
- "Repeating_Group_With_Same_Type_Definition" classifies the columns of a single table that have similar names, as in a kind of list (CLIENT1, CLIENT2, CLIENT3,…) and which are all of the same type. This list will have to be factorized into the column of a daughter table.
- "Similar_Columns_Not_Linked" lists the columns from different tables that have the same name and the same type but which are not linked to each other (no foreign key). This is typical of data redundancy.
- "Unique_Indexes_With_Nullable_Columns" locates all the unique indices that have a column that accepts null values, which is not coherent with a unique index.
- Performance
- "Index_Columns_Contained_By_Another_Index" indicates all the indices whose columns are already indexed by another index. The index is perhaps useless.
- "Index_With_Too_Many_Columns" shows the indices that index too many columns (4 or more). This can lead to worse performances.
- "Table_With_Too_Many_Indexes" shows the tables having an overly high index (5 or more) Too high of an index can ruin the performance.
- "Table_Without_Index_On_FK" classifies the talbes having foreign keys whose columns are not indexed. It is oftentimes prefereable to index them to avoid performance issues.
- Warning
- "Non_Unique_Index_With_Unique_Behavior" locates the unique undeclared indices whose columns contain the primary key. These indices should therefore be unique.
- "Similar_Columns_With_Different_Types" classifies all the columns from all the tables that bear the same name but not necessarily the same type. This most likely stems from some kind of redundancy but with a type error.
- "Table_With_No_Alternate_Key" lists the tables that only have a single primary key made up of a column that is of the numerical type and which is incremented automatically. These tables have no business keys.
- "Table_With_No_Relationship" indicates the tables that have no link to any other table. This table is probably useless in this scheme.
- "Table_Without_PK_Candidate" shows all the tables that have no constraint that guarantees the uniqueness of the table recordings.
Validating elements of the base
Launching a validation
The validation rules allow you to know whether the base is correct, by taking into account the modifications that were made during the modernization. It is possible to launch a validation on any element of the base. The validation will also be applied to the elements contained within the validated element.
The validation errors appear in the "Validation View", Fig 2.
It is also possible to filter the rules that are used (Window > Preferences > Model Validation > Constraints > SQL Constraints.), Fig 3.
Handling validation errors
From the "Validation View", a few choices are available. From the validation view, double-clicking on a line allows you to open the incriminated element so as to go through the right changes with assistance. On the contrary, the contextual menu allows you to launch the refactoring that is tied to the validation rule that corresponds to the error (the refactoring must explicitedly be connected to the validation rule, for more details see Refactoring) , ignores the error, or eliminates the element that produces the error (if it is possible).
Validation rules
SQLObject
- "ModelNameValidation" produces an error if the element does not have a Model Name or if another element in the same container already has the same name.
- "DatabaseNameValidation" produces an error if the element has no Database Name or if another element in the same container has the same.
Table
- "PrimaryKeyValidation" yields an error if the table does not have a primary key.
- "TableWithNoRelationshipValidation" yields an error if the table has no relation to the other tables.
- "TooManyIndexesValidation" yields an error if the table contains too many indices (6 or more.
- "TableWithoutPKCandidatesValidation" yields an error if the table has no means of guaranteeing the uniqueness of the recordings. It has no unique constraints and no unique index.
- "RepeatingGroupWithSameTypeValidation" yields an error if the table has columns that have names that form a sequence and whose types are the same. One must factorize these repetitions into the column of a daughter table.
- "RepeatingGroupWithDifferentTypeValidation" yields an error if the table has columns that have names that form a sequence and whose types are not the same. The types must be homogenized.
- "TableWithNoAlternateKeyValidation" yields an error if the table only has one primary key made up of a column of the numerical type and which does not have a business key.
Column
- "SimilarColumnsNotLinkedValidation" yields an error if columns with the same name in different tables have no relation that connects them (the talbes have no foreign key that connects them).
Primary Key
- "PrimaryKeyColumnsNullableValidation" yields an error if all the columns that make up the primary key accept null values.
Foreign Key
- "TableWithoutIndexOnFKValidation" yields an error if the columns of the foreign key are not indexed.
Index
- "UniqueIndexColumnsNullableValidation" yields an error if the columns of the index all accept null values.
- "NonUniqueIndexWithUniqueBehaviorValidation" yields an error if the index is not declared to be unique and nevertheless contains the columns of a unique constraint.
- "IndexWithTooManyColumnsValidation" yields an error if the index has too many columns.
- "IndexColumnsContainedByAnotherIndexValidation" yields an error if the columns of the index are already indexed by another.
- "IndexWithAllColumnsToDeleteValidation" yields an error if the columns of the index all need to be eliminated.
Routines
Creation
A routine is a java class that is used to generate specific processes during the transfer. They enable the manipulation of data from columns. The routines must be created in the package that has been built for that purpose within the project (src > routines). The project refers directly to the standard Talend libraries that can therefore be used in the routines that are created.
For example: the code of a routine containing a function that recovers a boolean in the form of a chain of characters and returning an integer corresponding to the boolean (1 = true, 0 = false).
Use
The routine can be called from a table or column as follows: "routines.name_routine.name_method(parameters)"
With the previous example, we can use the routine on a column that is stereotyped as "boolean" and whose original type is "VARCHAR(S)" and whose modernized type is "NUMBER(1)" . During the transfer of data, the column expression will be used.
SQL Editor
Creating a Scrapbook
Click on the
icon. A tab with connection profile parameters appears. To give more information about the base with which the script will be working, fill in the information about the type of connection, the name of the connection and the database via the scrolling menus.
By merely choosing the right connection name, the right type of connection will be selected directly.
Writing and launching queries
When the connection profile has been filled, the autocompletion (Ctrl + space) facilitates the construction of SQL queries. The queries can be launched individually or collectively, either with the contextual menu (right-click) or with corresponding keyboard shortcuts, Writing and launching queries, Fig 1.
The result is displayed in the "SQL Results" view, Fig 2.
The "Status" tab indicates whether or not the query has been successful. If not, the reason for the failure will be displayed. The "Result" tab displays the result of the query. The entire history is stored and allows one to visualize the results from previous queries.
Producing DDL
The production of database scripts is launched from the contextual menu of the base. This opération can be launched at any level of the tree structure. The generated artefacts will depend on the element on which the wizard has been launched, Producing DDL Fig 1.
In the pop-up that appears, one must first define the format of the instructions that need to be followed, Fig 2.
Then, one must define the types of SQL artifacts that need to be created. It is advised that you only create different constraints once the data has been transferred, Fig 3.
The instructions appear in the pop-up. You need only choose whether you want the instructions to be launched directly (check "Run DDL on server" ) or displayed by the SQL editor (check "Open DDL file for editing"). You can, of course, check both, Fig 4.
If you checked "Run DDL on server", you are asked to choose a connection. If no connection has yet been defined, it is necessary to make one via the "New…" button (When there is no defined connection, one have to be defined using the "New…" button (see ).
All you have left to do is to validate by clicking on "Finish".
If you selected "Open DDL file for editing", the instructions are displayed on the SQL editor. You can then edit and execute the script as you wish (see Writing and Launching Queries)
Transfer Scripts
Generating Scripts
Important : Before starting the transfer, the elements of the modernized base must be created (see Generating DDL). Nevertheless, you are strongly encouraged to wait until the data has been transferred to start adding constraints to the tables.
The contextual menu of the base allows you to create scripts for data transfer. One must select "Transform to Talend", Generating Scripts Fig 1.
A project with transfer scripts (or jobs) is created and can be seen in the "Package Explorer" view, Fig 2.
Talend Open Studio (transfer script)
Launch
Launch Talend Open Studio (if necessary, consult the Talend documentation, [1]).
Choose a workspace ("Referential") or make one via the "…" button.
Import the project generated by DBM via the "Importing existing project(s) locally" and click on "Ok". Select the "TALEND" project that was generated previously.
Approve by clicking on "open".
Principles
The transfer is done in multiple steps defined by the types of jobs that are created. There are three types of jobs, distributed over the following folders:
- Migrations: Transfer the data into temporary tables that contain the data from both legacy and modernized tables. You must have Legacy and modernized data for the transfer of data onto foreign keys (in the case where a foreign key gets values from a column, and the values are then deleted).
- ForeignKeys : Enhances the temporary tables that deal with added foreign keys.
- Finalizations : Transfer the data from temporary tables onto final tables.
The jobs will be executed in the following order : Migrations, ForeignKeys, and Finalizations. There exists a last job called "Migration" that calls all the jobs in the right order. The goal is to execute all of it in a single job.
Launching a job
To start a job, simply click on "Execute" in the "Execute" view. You can launch multiple jobs in parallel to accelerate the whole process. Otherwise, you can just launch the "Migration" job and wait till it's all done. You can of course open jobs, view them and even edit them. In that case, you are highly encouraged to propagate your edits to the DBM format so that your changes do not get lost if scripts ever have to be re-generated.
Context Variables
It is possible to create context variables that will be used in your jobs. In the "expression" attribute of columns or tables, just use the "context.[variable name]" and a context variable will be created in the Talend "context" file. Example : parametrizing the limited number of transferred recordings, dynamical parametrizing of information, etc….
This variable will then be taken into considération during the job execution.
Connection Parameters
You can redefine the connection parameters for the source and the modernized base.
Just open the file you're interested in and redefine the desired parameters on the second screen.
Custom components
When you use the Talend workspace for the first time, it is necessary to specify the location of the custom components and to load them otherwise an error will take place when you open a job, indicating that a component is not recognized. Go into Window > Preferences. Then go to Talend > Components and indicate the location of the "components" folder in the Talend project that has been created. Then click on "Apply"
Talend Documentation
For more information on Talend Open Studio, check out http://www.talend.com .
MagicDraw (UML model)
Generating the UML model
The UML2 model that corresponds to the database can be used for documentation purposes or as an entry point into Blu Age Reverse Modeling or Blu Age Forward Engineering (for more details, check out the documentation for those products).
The model is generated via the contextual menu of the base, by selecting "Transform to UML2", Generating the UML model Fig 1.
The generated files are visible from the "Package Explorer", Fig 2.
Importing the model into Magic Draw made by No Magic Inc
Via the "BluAge Converter" Plugin
First of all, add the "com.mdzip" file to the "profiles" folder in MagicDraw. To add to the plugin, launch MagicDraw, go to Help > Resource Manager. Click on "Import" and select the "com.netfective.bluage.reverse.emf2md.zip", then restart MagicDraw. Go to BluAge Converter > BluAge Convert EMF2XMI. Select the previously generated UML model "model.uml" and choose the target folder, Via the "BluAge Converter" Plugin Fig 1.
This plugin creates not only the MagicDraw project in the target folder but also imports the Blu Age profiles into the project. It also automatically applies the MagicDraw style file as generated by the DBM stereotypes. The colors of the stereotyped elements will then automatically be reused in MagicDraw. The plug-in will also automatically create the different diagrams, Fig 2.
Without the Plugin
Go to File > Import from >Eclipse UML 2 (v3.x) XMI file and select the generated "model.uml" file, Without the Plugin Fig 1.
One must then add the BluAge profiles via the File > Use Profile/Module menu and import them one by one from the "profiles" folder in MagicDraw in order to put them in the BluAge packaging/profile within the project, just as you would with the plugin (see Via the "BluAge Converter" plugin). If this is not done, it will become impossible to load the project onto a Teamwork Server (No Magic Inc tool), Fig 2.
The elements
Decorator
It is possible to activate and deactivate the element decorators (additional information about an element on its label or icon). Go to Window > Preferences. Then go to General > Appearance > Label Decorations and then check or uncheck the decorators that you would like to keep.
Generated
DBM is very tightly linked to the use of refactorings. Nevertheless, to every rule there is an exception.
As far as handling exceptions is concerned, the "Generated" checkboxes that follow each pièce of information are very important.
They tell you if the corresponding information was provided automatically by a refactoring or not.
By modifying any of these informations, you escape the "Generated" mode : the information will no longer be modified by any future refactoring.
To shift back into "Generated" mode, you need only re-check the checkbox and launch the corresponding refactoring.
Legacy and Added
All the elements of a base are either "Legacy" [the element exists in the original base] or "Added" [the element is added during the modernization]. "Legacy" elements contain attributes allowing them to differentiate their original state from the modernized one. This allows them to maintain a trace of the starting base. A typical example is the management of element names. A "Legacy" element has a "name" field that corresponds to its original name. But it also has the "databaseName" field that allows it to define its name in the future base. We can then use the visualization mode of the modernized names (see "Legacy" Connection) to work more easily with the new names. For the "Added" elements, the name and the modernized name are identical (DBM will automatically modify the other field during the modification of one of two names). If one does not want a "Legacy" element to appear in the modernized base, one should not eliminate it (which is impossible anyway) but rather check the "to delete" attribute. An added element can be directly deleted. During the modernization of a repository, the "Added" elements are transformed into "Legacy" elements in the new repository.
Schema
A schema only has a "name» attribute.
Sequence
- Name : Name of the sequence
- Database Name : Modernized name of the sequence.
- Start Value : Starting value (BigInteger)
- Increment : Increment value (BigInteger)
- Minimum : Minimal value (BigInteger)
- Maximum : Maximal value (BigInteger)
For the "LegacySequence", the modernized values are those that will be used in the new base.
Stereotype
Creating a Stereotype
The DBM stereotypes work in the same way as the UML stereotypes. They allow you to categorize different elements in light of their use in refactorings or simply for visualization. A stereotype can only be applied to a single type of element. Nevertheless, since many types inherit from others, child types can be stereotyped in the same way their parent type is. Since a stereotype is not an SQL element, it does not have the "Legacy" and "Added" concepts.
- Name : Name of the stereotype.
- Color : Color associated with the stereotype.
- Type : Type of element to which the stereotype can be applied.
To affect a stereotype, one must open the element that needs to be stereotyped and click on the
icon.
Activated : allows you to deactivate the stereotype if it does not need to be applied to the element. This is remarkably useful when a refactoring automatically applies the stereotypes and you eventually realize that the element does not need those stereotypes. A stereotype can have its own unique refactorings. They apply uniquely to stereotyped elements.
Creating a TaggedValue
A TaggedValue allows you to define a variable that is tied to the stereotype commonly referred to as a tag. This tag is going to play the rôle of an additional attribute of the stereotyped element. A stereotype can have multiple TaggedValues.
- Name : Name of the tag.
To define the value, click on the name of the stereotype in the list of stereotypes applied to the element. Then édit the value that you want to change. This value is a chain of characters.
A TaggedValue is useful when a refactoring will handle the stereotyped elements.
Table
- Name : Name of a table.
- Database Name : Modernized name of the base.
- Model Name : Name in the UML model of the table.
- Domain : Business domain in the UML model.
- Table Constraints : List of the table constraints (primary key, foreign key, unique constraint)
- Index : List of table indices.
- Comment : Table comments
- Expression : Expresses the manner in which the table recordings are entered.
Column
Name : Name of the column.
- Database Name : Modernized name of the column.
- Model Name : Name in the UML model of the table.
- Default Value : Default value of the column.
- Nullable : Accept or not the null value.
- Table : Column container.
- Type : Actual type of the column (the type that will be used in the modernized base)
- Legacy Type : Type in the old base.
- Move To : Modernized column where the data of that column will be transferred (if this table is to be deleted).
- Expression : Express the manner in which the data will be transferred.
Primary Key
- Name : Name of the primary key
- Database name : Modernized name of the primary key.
- Members : List of columns that make up the primary key.
Foreign Key
- Name : Name of the foreign key.
- Database Name : Modernized name of the foreign key.
- Target Table : Table containing the constrained columns.
- Unique Constraint : Unique constraint of the target table (if defined, the Unique Index is undefined).
- Unique Index : Unique index of the target table (if defined, the Unique Constraint is undefined).
- Members : Columns that make up the foreign key (in agreement with Constraint Members)
- Constraint Members : Colonnes contraintes par la clé étrangère (en concordance avec Members).
- Child Role Name : Name of the role of the child table (table containing the stranger key)
- Child Multiplicity : UML multiplicity of the child table.
- Parent Role Name : Name of the role of the parent table (table constraint)
- Parent Multiplicity : UML multiplicity of the parent table.
- Aggregation : the type of the relation (none, shared, composite).
The Sql Expression tab allows you to express the mapping for the feeding of a foreign key. The nit becomes a matter of coding the condition, just as you would for the where clause of a query.
Index
- Name : Name of the index
- Database name : Modernized name of the index.
- Unique : Determine whether the indexed recordings are unique or not.
- Columns : List the indexed columns.
Refactoring
Single Refactoring
A single refactoring is launched from an element to which it can be applied.
- Name: Name of refactoring
- Refactoring: Opens the tab towards the body of the refactoring
- Description : Description of the refactoring
- Type : Type of element to which the refactoring can be applied.
- Validation Rule : Validation rule to which the refactoring is tied.
- Pre Check : Opens the code tab that checks the element before the refactoring is launched.
- Final Check : Opens the code tab that checks the parameters of the refactoring before the latter is launched.
Global Refactoring
Global refactoring can be used in two ways:
- As a single refactoring. It will only be applied to the selected elements.
- From anywhere, it will be applied to all the elements of the schema in accordance with the parameters.
- Name: Name of refactoring
- Refactoring: Opens the tab to the body of the refactoring
- Description : Description of refactoring
- Type : Type of element to which the refactoring can be applied.
- Validation Rule : Validation rule to which the refactoring is tied.
Macro
A macro is used to generate code for the design of a refactoring by recording the manual actions of the user. To start the recording and stop it, click on the
icon. All the modifications made during the recording are cancelled at the end of the recording and the corresponding code is generated. The generated code is not meant to be used as is but can be a foundation to create the refactorings.
Parameters
To create a parameter for a refactoring, one must right-click on the "Parameters" folder of the refactoring and select "Add a parameter». The name and description of the parameter are required.
If you have already generated the code before creating the parameters, you will have to add them in the déclaration of the "refactor" function. The parameters are chains of characters. When the refactoring is launched, you will be asked to select them.
Code
For more details on this part, refer to the DBM JSDoc. Here is an example of a single refactoring called "Example" with a parameter called "parameter", Code Fig 1.
First of all, here's the code for the precheck that checks that the column name is not empty (otherwise it records an error), Fig 2.
For the finalcheck, the code checks taht the parameter is not empty, otherwise it registers an error, Fig 3.
Finally the refactoring code consists in concatenating the name of the column with the parameter and assigning it to the databaseName, Fig 4.
Refactor is the main function being called, "this" corresponds to the element on which the refactoring is applied and "AddSuffixe" is an additional function. (For more details on the code, see the API documentation)
Launch the refactoring
To launch a refactoring, open the contextual menu of the element (right-click) then go to Refactorings. Then, you have the choice between the global or simple refactorings. If you choose a global refactoring, the element you have selected does not matter because the refactoring will be applied to all the elements. However, if you chose a simple refactoring, only the selected element will be affected. It is worth noting that a global refactoring will appear on the list of simple refactorings if the type that is applied corresponds to the selected element, Launch the refactoring Fig 1. Here's an example with a pre-check error :
The refactoring is cancelled.
Here is an example of an error of warning type from the finalcheck, Fig 4.
You have the choice either to continue, even with an error in your parameters, in which case the refactoring will be executed or to go back, in which case you can redefine the parameter, Fig 5.
When defining the parameter, you are given the choice to either validate or visualize the changes (preview). This previsualization allows one to preview the chages made by the refactoring, Fig 6.
On the left, you can find the modified base, and on the right you can see the original base. You need only click on "OK" to launch the refactoring.
Recovering data from a workbook
It is possible to use an Excel spreadsheet (.xls) in a refactoring. One must then use the "getWorkbook(file_path)" method. The required parameter is a chain of characters that contains the file path. It is recommended that you create a separate folder in the project for this sort of file.
Here's the code to open the spreadsheet :
(Check out the API documentation for more information on how to manipulate the file data)
Base Refactorings
SQLObject
- "ModelNameRefactoring" adds the modelName selected by the user.
- "DatabaseNameRefactoring" adds the databaseName selected by the user.
Table
- "PrimaryKeyRefactoring" creates a new primary key made up of columns whose names are selected by the user.
- "TableWithNoRelationshipRefactoring" sets up the table for deletion.
- «TableWith TooManyIndexesRefactoring" deletes the indices whose names are selected by the user.
- "TableWithoutPKCandidatesRefactoring" creates a primary key made up of columns whose names are selected by the user.
- "RepeatingGroupWithSameTypeRefactoring" creates a child table whose columns are the repeating column.
- "RepeatingGroupWithDifferentTypeRefactoring" affects the type selected by the user.
- "TableWithNoAlternateKeyRefactoring" creates a unique constraint with the columns whose names are selected by the user.
Column
- "TypeRefactoring" affects the type selected by the user.
- "SimilarColumnsWithDifferentTypesRefactoring" affects the type selected by the user.
- "SimilarColumnsNotLinkedRefactoring" créâtes a foreign key with the column that constrains the similar column that is part of the primary key.
Primary Key
- "PrimaryKeyColumnsNullableRefactoring" removes the "nullable" attribute from the columns whose names are selected by the user.
Foreign Key
- "TableWithoutIndexOnFKRefactoring" creates an index over the columns of the foreign key.
Index
- "UniqueIndexColumnsNullableRefactoring" removes the "nullable" attribute from the columns whose names are selected by the user.
- "NonUniqueIndexWithUniqueBehaviorRefactoring" makes the index unique.
- "IndexWithTooManyColumnsRefactoring" dissociates the columns whose names are selected by the index user.
- "IndexColumnsContainedByAnotherIndexRefactoring" sets up the index for deletion.
- "IndexWithAllColumnsToDeleteRefactoring" sets up the index for deletion.
Example of Legacy Pacbase Database modernization
Objective
The modernization of a patrimonial base must be done in several steps. The first step consists only in going from the base in file mode to an SQL relational base that keeps all the PACBASE information. The segments will be replaced by tables while the rubric calls will be replaced by columns. The file data should be transerable to the SQL base without loss of information.
The next steps are then only typical modernizations of relational databases (relabelling, creating/deleting elements, changing types, ….). (See Modernize). The PACBASE language is exploited to the maximum to increase the automation and the recovery semantics (relabelling information, links between tables, etc…) This part présents the elements and the processes needed for the first step of the modernization..
The elements
PacbaseTable
In DBM, the segments are represented by "PacbaseTable". This element is similar to a "LegacyTable". It has an additional element called "Option" that contains the parameters needed to extract the data that correspond to its data file. It will become a table in the modernized base.
Some segments are found as "Group" in other siegments or only contain information relative to the structure of data in the files. In that case, it will be necessary to delete these segments.
Option
The Option element is associated with a "PacbaseTable". It allows you to define which data from the files will feed into the future table.
- Name : the name of the option.
- File : the logical name of the data source in the files that supply the table. This name will be associated with a file. If this file supplies multiple tables, the name must be identical throughout the table options.
- Header Size : the size, in number of lines, of the header of the data file.
- New Line Size : the number of new lines between each recording of the data file.
- Path : the physical path of the data file.
- Encoding : the file encording.For EBCDIC files, the encoding is "Cp037".
- Condition : The condition for filtering data from files. Necessary when the file will supply several other tables. It is important to only recover the necessary adequate data. On the explorer, this condition appears next to the name of the table.
PacbaseColumn
A call to a rubric is represented by a "PacbaseColumn". This element is similar to a LegacyColumn. It has two additional attributes, "ParsingType" and "Signed".
"ParsingType" is the type considered during the Reading of the file in order to transfer the data. In principle, it should be identical to the "LegacyType". Nevertheless, it sometimes happens that the specified data type does not correspond to the data presented in the files. This error is oftentimes detected during the transfer of data when one notices that the data that has been read does not have the right type to be inserted into the new column.
"Signed" only indicates whether the values in the rubric call are signed or not. A rubric call becomes a column in the modernized base.
Types
In the Pacbase files, there are many data types:
- X : strings of characters
- 9 : classical numeric
- 3-8 : compact numeric
Each type has a size. Ex : X(10), string of characters of size 10.
Group
"Group" is a "PacbaseColumn" that can contain other "Group" or "PacbaseColumn". It is differentiated by this icon
.
Generally, these groups should be deleted. However, if you wish to keep one, remove the checkmark on the "To Delete" attribute. In this case, all the elements contained by the group should be deleted. If you also wish to keep the elements contained in the group, remove the checkmark on the "To Delete" attribute of these elements. And finally, if you wish to remove the "Group" by checking the attribute, be aware that this will remove the attribute from the contained elements.
Redefine
"Redefine" is similar to a "PacbaseTable" except that it can be created in a "PacbaseTable", a "Group" or a "Redefine". Moreover, it has a "To Table" attribute that indicates whether one should create a separate table for this "Redefine". In general, if a redefine is transformed into a table, this opération is done the same way for the other "Redefine" of the same group.
is the icon associated with it.
"Redefine" always comes in pairs. The first oftentimes contains a global rubric call while the second contains a set of rubric calls that forms a hodgepodge of the first rubric call (and the second contains calls of rubrics which detail the first rubric call.) Generarly , iIn a modernized base, we only keep one of two data representations to avoid errors to avoid doublons. Therefore, one of the two ought to be eliminated. However, if you wish to keep both because certain data sets require it, you must add a condition on the "Redefine" to know in which case to fill one or the other.
Automatic Modifications
When creating the connection, a set of modifications is done automatically:
- Add a primary key to each table with a column and its associated sequence
- Add foregin keys for each linking information (ex. Hierarchical description in PACBASE)
- Add a "PacbaseColumn" named "DUMMY" that should be deleted because it only reads the file during the transfer.
- By default, all groups should be deleted (see Group Group for more details on this)
By default, the "PacbaseColumn" named "Filler" should also be deleted. Generally, they only served to fill the extra space that was allocated to the data when they did not meet.
Talend Scripts
To generate Talend scripts of the Pacbase model, one should select "Transform to Talend (Pacbase)" in the contextual menu of the base.
(For the rest, check out Talend Open Studio (transfer script)
UML2 Model generation: Sample with Pacbase
Collaborative Work
Sharing a Project
For collaborative work, one must use a version controller. Above all else, consider refreshing the project, otherwise you risk running into problems when the files have to be committed. To share the project, go into the "Package Explorer" view, open the contextual menu of the project (right-click). Go to Team «> Share Project, Sharing a Project Fig 1.
The following pop-up opens Fig 2.
First choose the version controller that you want to use. In this example we use CVS. In the case that you do not have a defined repository, we will ask you to define one, Fig 3. Define the parameters necessary to the repository, Fig 4.
You can choose the name of the project on the repository, Fig 5.
Click on "Finish" to commit all the files.
The commit wizard opens. Click on "Next", add a comment if you wish and click on "Finish". Reminder: If you have a problem , you might have forgotten to refresh your project before the commit.
Import a project
To import a project, open your source manager perspective and check out the project.
Once the project has been imported, create an "import" repository and choose the project you just checked-out (voir Import).
Synchronization and Commit
To synchronize with the project on the repository, open the contextual menu of the base of the "Legacy" connection (right-click) and select "Synchronize". Don't forget to refresh the project before getting started, Synchronization and Commit Fig 1.
The connection is going to be in connected mode and is going to be open the appropriate perspective. You need only select the files that you want to "commit", right-click, select "Commit" and validate with a potential comment, Fig 2 and 3.
Finally, when it's all done, remember to Switch back to offline mode (right click > Work Offline). Warning: Do not go directly through the project in the "Package Explorer" view to start synchronization.
(For more details on the use of version control, check out the appropriate documentation)


