Features
- Merging of N files into a single file,
- Automatic identification of missing or badly formatted data, smart cleaning (e.g. unification of different data formats),
- Column management (deletion, shift, deactivation, etc.),
- Markov: column creation by time shift,
- Classification: column creation by converting numeric values into a finite set of classes (e.g. slices in steps of 10%),
- Text conversion: column creation by converting text values to numeric values,
- 6 sampling algorithms,
- Sharpening: sub-node creation by keeping only clean data,
- Several algorithms for truncation or simplification of data values,
- Detailed statistical node report,
- No manual modification allowed,
- XY-graph colored by a third column, Y-graph,
- 2D-projection node of a data node using the t-SNE algorithm,
- Creation of 2 columns in the data node from the t-SNE 2D-projection,
- F-inverse stat report node: from a chosen column C, create clusters according to C-values and compute diameters and inter-distances between clusters,
- N-dimensional binning clustering node, containing the inter-distance matrix and a representation graph,
- Use of forecasting or self-organizing map models from PREDICT for column creation,
- Text file export,
- Outlier identification and cleaning algorithms for these outliers.
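
To make two of the column-creation features above more concrete, here is a minimal Python sketch of a time shift (the "Markov" item) and of a classification into slices of 10%. It is only an illustration under stated assumptions, not DEXTER's actual implementation: the function names `time_shift` and `classify_by_step` are hypothetical, and the slices are assumed to be equal fractions of the column's min-max range.

```python
from typing import List, Optional, Sequence


def time_shift(values: Sequence[float], lag: int) -> List[Optional[float]]:
    """Hypothetical helper: new column where row i holds the value of row i - lag.

    The first `lag` rows have no past value, so they are filled with None.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    shifted: List[Optional[float]] = [None] * min(lag, len(values))
    shifted.extend(values[: max(len(values) - lag, 0)])
    return shifted


def classify_by_step(values: Sequence[float], step_fraction: float = 0.10) -> List[int]:
    """Hypothetical helper: map each value to a class index.

    Assumption: the min-max range of the column is cut into equal slices,
    each covering `step_fraction` of the range (10 classes for 10%).
    """
    if not 0 < step_fraction <= 1:
        raise ValueError("step_fraction must be in (0, 1]")
    if not values:
        return []
    low, high = min(values), max(values)
    span = high - low
    n_classes = round(1 / step_fraction)
    if span == 0:
        # A constant column falls entirely into the first class.
        return [0] * len(values)
    classes = []
    for v in values:
        index = int((v - low) / span * n_classes)
        # The maximum value would land in class n_classes; keep it in the last slice.
        classes.append(min(index, n_classes - 1))
    return classes
```

For example, `time_shift([1.0, 2.0, 3.0, 4.0], 1)` gives `[None, 1.0, 2.0, 3.0]`, so each row can be compared with the previous one. Whether the real tool slices by range or by quantiles is not stated in this list; the range-based choice here is just the simplest reading of "slices in steps of 10%".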
Even on big data sets, DEXTER adjusts its resource needs to the RAM size and CPU load so that it never ‘blocks’ the machine.
Idea: to be able to work ‘off the grid’, leaving your machine working overnight and getting the results in the morning.
Almost all DEXTER functionalities work in background mode, allowing the user to continue working with DEXTER.