These slides describe aspects of developing a new Epsilon EMC driver. They cover the basics of implementing the IModel interface, continue with additional details that can be added to the implementation, and close with a short introduction to optimized execution of first-order operations on collections.
3. CSV Pro Driver
• An EMC driver for Comma Separated Value (CSV) files
with added functionality
• Each row is an element
• Each column is an attribute
• NEW:
• One column to identify type
• One column used as index – optimized search
4. Overview
• Two Eclipse plugins
• One for the EMC driver implementation
• org.eclipse.epsilon.emc.csvpro
• Implement IModel interface
• Provide Property Getter and Setter
• Provide optimized select operations
• Can run outside Eclipse
• One for the driver’s Eclipse-based development tools
• org.eclipse.epsilon.emc.csvpro.dt
• Source code available under
https://git.eclipse.org/c/epsilon/org.eclipse.epsilon.git/tree/examples
6. Epsilon EOL Engine Overview
for (e in m!Row.all()) {
    e.name.println();
    e.name = "Epsilon";
}
Model identifier All of Type
Get property ‘name’
of the element
Set property ‘name’
of the element
7. Activities
• Provide an implementation of the IModel interface
• Retrieve the model
• Get all elements of the model
• Get all elements of the model by type
• Provide an implementation of IPropertyGetter
• Retrieve the value of a property
• Provide an implementation of IPropertySetter
• Set the value of a property
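The three responsibilities listed above can be sketched in plain Java. This is only an illustration under stated assumptions: RowMapModelSketch and its methods are hypothetical stand-ins that mirror the roles of IModel, IPropertyGetter and IPropertySetter; they do not implement the real Epsilon interfaces. Rows are represented as maps from property name to value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a driver. It does NOT implement the real
// IModel, IPropertyGetter or IPropertySetter; it only mirrors their roles.
final class RowMapModelSketch {

    // Each row is a map from property (column) name to value.
    private final List<Map<String, Object>> rows = new ArrayList<>();

    Map<String, Object> addRow(String name) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", name);
        rows.add(row);
        return row;
    }

    // Role of "get all elements of the model by type"; only "Row" is known here.
    List<Map<String, Object>> allOfType(String type) {
        if (!"Row".equals(type)) {
            throw new IllegalArgumentException("Unknown type: " + type);
        }
        return new ArrayList<>(rows);
    }

    // Role of the property getter: retrieve the value of a property.
    Object getProperty(Map<String, Object> row, String property) {
        return row.get(property);
    }

    // Role of the property setter: set the value of a property.
    void setProperty(Map<String, Object> row, String property, Object value) {
        row.put(property, value);
    }

    // Java rendering of the EOL loop:
    //   for (e in m!Row.all()) { e.name.println(); e.name = "Epsilon"; }
    // The values that would be printed are collected and returned instead.
    static List<Object> runLoop(RowMapModelSketch m) {
        List<Object> printed = new ArrayList<>();
        for (Map<String, Object> e : m.allOfType("Row")) {
            printed.add(m.getProperty(e, "name"));
            m.setProperty(e, "name", "Epsilon");
        }
        return printed;
    }
}
```

Each statement of the EOL loop maps to exactly one of the three roles, which is why these are the minimum a driver has to provide.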
9. Alternatives
• Provide a custom implementation of the data
model - e.g. custom file parser
• CSV, Bibtex, FileSystem, etc…
• Use the data model engine/framework – e.g. wrap
existing file parser
• Databases, EMF, XML, etc…
10. CSV Pro EMC Driver
org.eclipse.epsilon.emc.csvpro
11. Implementing IModel
• Easy: extend CachedModel
(org.eclipse.epsilon.eol.models.CachedModel<ModelElementType>)
• Cached model improves performance
• Basic implementation of most methods
• Hard: implement the whole interface
12. Implementing IModel
• How to represent the data?
• Existing: Each Row is a Map <property: value>, the
model is a list of maps
• Alternative: Each Row is an Integer (representing line
number), the model is a Map <property: List<values>>,
list of values indexed by row
• Implement IPropertyGetter and IPropertySetter
• Advice: Extend the existing abstract implementations
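The alternative representation (each row is an Integer, the model is a map from property name to a list of values indexed by row) might look like the following. It is a minimal sketch: ColumnStoreSketch and its method names are hypothetical, and the get/set methods only play the role that the property getter and setter would play in a real driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical column-oriented store; names are illustrative only.
final class ColumnStoreSketch {

    // property name -> values of that column, indexed by row number
    private final Map<String, List<String>> data = new LinkedHashMap<>();
    private int rowCount = 0;

    ColumnStoreSketch(List<String> headers) {
        for (String header : headers) {
            data.put(header, new ArrayList<>());
        }
    }

    // Appends one value per column and returns the new element (its row number).
    Integer addRow(List<String> values) {
        if (values.size() != data.size()) {
            throw new IllegalArgumentException("Expected " + data.size() + " values");
        }
        int f = 0;
        for (List<String> column : data.values()) {
            column.add(values.get(f++));
        }
        return rowCount++;
    }

    // Role of the property getter: index the column list by row number.
    String get(Integer row, String property) {
        return column(property).get(row);
    }

    // Role of the property setter.
    void set(Integer row, String property, String value) {
        column(property).set(row, value);
    }

    private List<String> column(String property) {
        List<String> column = data.get(property);
        if (column == null) {
            throw new IllegalArgumentException("Unknown property: " + property);
        }
        return column;
    }
}
```

With this layout a model element carries no data of its own, so the getter and setter need access to the model's map to resolve a value.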
23. Implement Configuration Dialog
private void createFileGroup(Composite parent) {
    final Composite groupContent =
        createGroupContainer(parent, "Files", 3);
    final Label modelFileLabel = new Label(groupContent, SWT.NONE);
    modelFileLabel.setText("Model file: ");
    fileText = new Text(groupContent, SWT.BORDER);
    fileText.setLayoutData(new GridData(GridData.FILL_HORIZONTAL));
    final Button browseFile = new Button(groupContent, SWT.NONE);
    browseFile.setText("Browse Workspace...");
    browseFile.addListener(SWT.Selection,
        new BrowseWorkspaceForModelsListener(fileText,
            "CSV files in the workspace",
            "Select a CSV file"));
}
24. Implement Configuration Dialog
protected void createCsvGroup(Composite parent) {
    final Composite groupContent =
        createGroupContainer(parent, "CSV", 4);
    final Label modelFieldSeparatorLabel =
        new Label(groupContent, SWT.NONE);
    modelFieldSeparatorLabel.setText("Field Separator: ");
    fieldSeparatorText = new Text(groupContent, SWT.BORDER);
    fieldSeparatorText.setLayoutData(
        new GridData(GridData.HORIZONTAL_ALIGN_BEGINNING));
    knownHeadersBtn = new Button(groupContent, SWT.CHECK);
    knownHeadersBtn.setText("Known Headers");
    …
    varargsHeadersBtn = new Button(groupContent, SWT.CHECK);
    varargsHeadersBtn.setText("Varargs Headers");
}
25. Hook Model to Epsilon
• We need to tell Epsilon there is a new model type
available
• The name of the model type
• The class that implements the model
• The configuration dialog
• An icon for the launch dialog
• Use org.eclipse.epsilon.common.dt.modelType
Extension Point
• Added using the MANIFEST.MF (Extension tab)
26. Hook Model to Epsilon
Must match getModelName() Must match getModelType()
@Override
protected String getModelName() {
    return "CSV model";
}
@Override
protected String getModelType() {
    return "CSV";
}
27. Hook Model to Epsilon
• Save the user options in a properties object
@Override
protected void storeProperties() {
    super.storeProperties();
    properties.setProperty(CsvProModel.PROPERTY_FILE,
        fileText.getText());
    properties.setProperty(
        CsvProModel.PROPERTY_FIELD_SEPARATOR,
        fieldSeparatorText.getText());
    properties.setProperty(
        CsvProModel.PROPERTY_HAS_KNOWN_HEADERS,
        String.valueOf(knownHeadersBtn.getSelection()));
    …
28. Implementing IModel
• With a configuration in place, we can finish the model
implementation
• Implement load (read additional properties), loadModel
(parse text file) and owns
@Override
public void load(StringProperties properties,
        IRelativePathResolver resolver)
        throws EolModelLoadingException {
    super.load(properties, resolver);
    this.file = resolver.resolve(
        properties.getProperty(PROPERTY_FILE));
    this.fieldSeparator =
        properties.getProperty(PROPERTY_FIELD_SEPARATOR);
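The slides do not show loadModel itself. As a rough idea of the parsing step, the sketch below reads lines, takes the property names from the first line when headers are known, and fills a map of columns. Everything here is an assumption made for illustration: CsvParseSketch, the generated "fieldN" names used when there are no headers, and the padding of short rows with empty strings. Quoted fields and varargs headers are not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical parser sketch; not the CSV Pro driver's loadModel.
final class CsvParseSketch {

    static Map<String, List<String>> parse(List<String> lines,
                                           String fieldSeparator,
                                           boolean knownHeaders) {
        Map<String, List<String>> data = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return data;
        }
        // Quote the separator so characters such as "|" or "." are taken literally.
        String regex = Pattern.quote(fieldSeparator);
        String[] first = lines.get(0).split(regex, -1);

        List<String> keys = new ArrayList<>();
        for (int f = 0; f < first.length; f++) {
            // Assumption: without headers, columns are named field0, field1, ...
            String key = knownHeaders ? first[f] : "field" + f;
            keys.add(key);
            data.put(key, new ArrayList<>());
        }

        // Skip the header line only when it exists.
        for (int i = knownHeaders ? 1 : 0; i < lines.size(); i++) {
            String[] values = lines.get(i).split(regex, -1);
            for (int f = 0; f < keys.size(); f++) {
                // Assumption: missing trailing values are stored as empty strings.
                data.get(keys.get(f)).add(f < values.length ? values[f] : "");
            }
        }
        return data;
    }
}
```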
33. Support Types
• Modify the DT plugin to pick a field to define the
type
• Add a new label+text to specify field name
• Optional: Parse the CSV file to present headers as
dropdown for selection
• Modify the CSV Pro Model to support types
• Identify types during model loading
• Modify getAllOfTypeFromModel() to support types
• Modify getAllOfKindFromModel() to support types
• Row is a super type of all types
35. Support Types
• Modify the CSV Pro Model to support types
for (int f = 0; f < keys.size(); f++) {
    List<String> datavals = data.get(keys.get(f));
    if (useTypeColum) {
        if (f == typeColum) {
            List<Integer> typed =
                typedElements.get(values.get(f));
            if (typed == null) {
                typed = new ArrayList<Integer>();
                typedElements.put(values.get(f), typed);
            }
            typed.add(index);
        }
    }
    datavals.add(values.get(f));
}
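A self-contained variant of the type index built in the loop above, together with the lookups it enables, could look like this. TypeIndexSketch, allOfType and allOfKind are hypothetical names, not the driver's getAllOfTypeFromModel and getAllOfKindFromModel. Treating "Row" as the supertype of every type follows the slides; returning no elements for an exact-type lookup of an unknown name is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical type index; elements are row numbers.
final class TypeIndexSketch {

    // value of the type column -> row numbers having that type
    private final Map<String, List<Integer>> typedElements = new HashMap<>();
    private final List<Integer> allRows = new ArrayList<>();

    // Called once per row during loading, with the value found in the type column.
    void addRow(String typeValue) {
        int index = allRows.size();
        allRows.add(index);
        typedElements.computeIfAbsent(typeValue, k -> new ArrayList<>()).add(index);
    }

    // Exact type: only rows whose type column holds this value.
    List<Integer> allOfType(String type) {
        return Collections.unmodifiableList(
            typedElements.getOrDefault(type, Collections.emptyList()));
    }

    // Kind: "Row" is the supertype of all types, so it matches every row.
    List<Integer> allOfKind(String type) {
        if ("Row".equals(type)) {
            return Collections.unmodifiableList(allRows);
        }
        return allOfType(type);
    }
}
```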
38. Select Operation Overview
m!Row.all().select(r | r.id == "531-52-7468");
for (Object element : list) {
    if ("531-52-7468".equals(element.id)) {
        return element;
    }
}
SELECT * FROM ROWS WHERE id = "531-52-7468"
optimize
39. Epsilon optimization overview
• To provide optimized execution:
• Inform what operations can be optimized
• Provide an implementation of these operations
• Operations that can be optimized
• Collections returned by the model (e.g. from
getAllOfTypeFromModel) should implement
IAbstractOperationContributor
• Or, the model implements
IAbstractOperationContributor and keeps track of
“owned” collections
• Operation implementation
• Extend the operations to be optimized
40. Optimize select by “id”
• Provide an optimization of the select operation if
the filter field is the id
• Allow the user to specify what column to use for
optimized searches
• Add a data structure to the model that allows fast search
by id (e.g. map)
• Provide a wrapping collection to implement
IAbstractOperationContributor
• Provide a SelectOperation that identifies a select by id
and uses the data model to optimize execution
41. Optimize select by “id”
• Allow the user to specify what column to use for
optimized searches
42. Optimize select by “id”
• Add a data structure to the model that allows fast
search by id
private TreeMap<String, Integer> rows;
…
if (useIndexColum && (f == indexColum)) {
rows.put(values.get(f), index);
}
…
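Because a TreeMap keeps its keys sorted, the same index can answer range comparisons as well as equality. The sketch below is hypothetical (IdIndexSketch and the operator strings are illustrative names) and assumes that each id value identifies exactly one row and that the natural String ordering is the ordering wanted for the ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical index from id value to row number.
final class IdIndexSketch {

    private final TreeMap<String, Integer> rows = new TreeMap<>();

    void put(String id, int rowIndex) {
        rows.put(id, rowIndex);
    }

    // Returns the row numbers whose id satisfies "id <operator> value".
    List<Integer> select(String operator, String value) {
        List<Integer> result = new ArrayList<>();
        switch (operator) {
            case "==":
                if (rows.containsKey(value)) {
                    result.add(rows.get(value));
                }
                break;
            case "<":
                result.addAll(rows.headMap(value, false).values());
                break;
            case "<=":
                result.addAll(rows.headMap(value, true).values());
                break;
            case ">":
                result.addAll(rows.tailMap(value, false).values());
                break;
            case ">=":
                result.addAll(rows.tailMap(value, true).values());
                break;
            case "<>":
                // Everything strictly below plus everything strictly above.
                result.addAll(rows.headMap(value, false).values());
                result.addAll(rows.tailMap(value, false).values());
                break;
            default:
                throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
        return result;
    }
}
```

Equality becomes a single map lookup, and the range operators return a view of the sorted map instead of testing every row.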
43. Optimize select by “id”
• Provide a wrapping collection to implement
IAbstractOperationContributor
• One possibility is to use the delegate pattern
public class CsvProCollection implements List<Integer>,
        IAbstractOperationContributor {
    List<Integer> delegate;
    …
    public boolean contains(Object o) {
        return delegate.contains(o);
    }
    …
    @Override
    public AbstractOperation getAbstractOperation(String name) {
        if ("select".equals(name)) {
            return new CsvProCollectionSelectOperation();
        }
        …
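The delegate pattern can be tried out without the Epsilon classes. In the sketch below, OperationContributorSketch and OperationSketch are hypothetical stand-ins for IAbstractOperationContributor and AbstractOperation, and the collection extends AbstractList only to keep the example short; the slide's class implements List<Integer> directly and forwards every method.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical stand-in for an operation; not Epsilon's AbstractOperation.
interface OperationSketch {
    String name();
}

// Hypothetical stand-in for IAbstractOperationContributor.
interface OperationContributorSketch {
    // Returns a custom operation for the given name, or null to use the default one.
    OperationSketch getOperation(String name);
}

// A list that forwards to a delegate and contributes its own "select".
final class CsvCollectionSketch extends AbstractList<Integer>
        implements OperationContributorSketch {

    private final List<Integer> delegate;

    CsvCollectionSketch(List<Integer> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Integer get(int index) {
        return delegate.get(index);
    }

    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public boolean contains(Object o) {
        return delegate.contains(o);
    }

    @Override
    public OperationSketch getOperation(String name) {
        if ("select".equals(name)) {
            return () -> "select"; // placeholder for the optimized select
        }
        return null; // every other operation falls back to the default
    }
}
```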
44. Optimize select by “id”
m!Row.all().select(r | r.id == "531-52-7468");
public Object execute(Object target,
        Variable iterator,
        Expression ast, IEolContext context,
        boolean returnOnFirstMatch)
        throws EolRuntimeException {}
[Callouts: target, iterator, ast]
• SelectOperation that identifies a select by id
45. Optimize select by “id”
m!Row.all().select(r | r.id == "531-52-7468");
Assuming SSN has some ordering,
we want to support ==, >, >=, <=, <,
<>. So ast must be:
• EqualsOperatorExpression
• GreaterThanOperatorExpression
• Etc.
[Diagram: the FirstOperand of ast is the PropertyCallExpression r.id; its TargetExpression (r) matches the iterator, and its PropertyNameExpression (id) matches the “id” column name]
46. Optimize select by “id”
if (!(ast instanceof EqualsOperatorExpression ||
        ast instanceof GreaterThanOperatorExpression ||
        ast instanceof GreaterEqualOperatorExpression ||
        ast instanceof LessEqualOperatorExpression ||
        ast instanceof LessThanOperatorExpression ||
        ast instanceof NotEqualsOperatorExpression)) {
    return false;
}
Assuming SSN has some ordering, we want to
support ==, >, >=, <=, <, <>. So ast must be:
• EqualsOperatorExpression
• GreaterThanOperatorExpression
• Etc.
47. Optimize select by “id”
final OperatorExpression opExp =
    (OperatorExpression) ast;
final Expression rawLOperand =
    opExp.getFirstOperand();
if (!(rawLOperand instanceof PropertyCallExpression)) {
    return false;
}
[Callout: the first operand, r.id, must be a PropertyCallExpression]
48. Optimize select by “id”
final PropertyCallExpression lOperand =
    (PropertyCallExpression) rawLOperand;
final Expression rawTargetExpression =
    lOperand.getTargetExpression();
if (!(rawTargetExpression instanceof NameExpression)) {
    return false;
}
final NameExpression nameExpression =
    (NameExpression) rawTargetExpression;
if (!iterator.getName()
        .equals(nameExpression.getName())) {
    return false;
}
[Callout: the TargetExpression of r.id must match the iterator]
49. Optimize select by “id”
final NameExpression propertyNameExpression =
    lOperand.getPropertyNameExpression();
if (!index.equals(propertyNameExpression.getName())) {
    return false;
}
[Callout: the PropertyNameExpression of r.id must match the “id” column name]
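The checks from slides 46 to 49 can be put together in one method. The sketch below uses a hypothetical, much simplified AST (Expr, Name, Literal, PropertyCall, Operator) in place of the Epsilon expression classes, and a set of operator symbols in place of the instanceof tests, so that it can be run on its own.

```java
import java.util.Set;

// Hypothetical mini AST; these are NOT the Epsilon expression classes.
final class SelectByIdMatcherSketch {

    static class Expr { }

    static final class Name extends Expr {
        final String name;
        Name(String name) { this.name = name; }
    }

    static final class Literal extends Expr {
        final String value;
        Literal(String value) { this.value = value; }
    }

    static final class PropertyCall extends Expr {
        final Expr target;
        final Name property;
        PropertyCall(Expr target, Name property) {
            this.target = target;
            this.property = property;
        }
    }

    static final class Operator extends Expr {
        final String symbol;
        final Expr first;
        final Expr second;
        Operator(String symbol, Expr first, Expr second) {
            this.symbol = symbol;
            this.first = first;
            this.second = second;
        }
    }

    private static final Set<String> SUPPORTED =
        Set.of("==", ">", ">=", "<=", "<", "<>");

    // True when ast has the shape "<iterator>.<indexColumn> <op> <anything>".
    static boolean isOptimisable(String iteratorName, Expr ast, String indexColumn) {
        // 1. The expression must be one of the supported comparisons.
        if (!(ast instanceof Operator)) {
            return false;
        }
        Operator op = (Operator) ast;
        if (!SUPPORTED.contains(op.symbol)) {
            return false;
        }
        // 2. Its first operand must be a property call such as r.id.
        if (!(op.first instanceof PropertyCall)) {
            return false;
        }
        PropertyCall call = (PropertyCall) op.first;
        // 3. The target of the property call must be the iterator variable.
        if (!(call.target instanceof Name)
                || !iteratorName.equals(((Name) call.target).name)) {
            return false;
        }
        // 4. The property must be the column chosen as index.
        return indexColumn.equals(call.property.name);
    }
}
```

When the method returns false, the operation can fall back to the default select, so the optimization never changes the result, only how it is computed.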
Let's examine an EOL expression: a loop over all elements of type Row that prints the name attribute of each row and then sets it.
The model identifier tells the engine which model to use (from the configuration, discussed later)
Get all elements of the given type
Get the value of a property
Set the value of a property
Getting all elements by type is used by ETL, EVL, etc. to apply a rule to every element that matches the rule's context
There are two alternatives to providing a driver.
Provide a custom implementation of the data model. You write the Java code that interacts with the model
Use an existing engine/framework to work with the data. The framework provides the functionality, the driver is just a proxy.
The first decision is how to represent the data: your own data structure, maps, lists, etc.
Extend the existing property getter and setter implementations unless you need to write them from scratch
Our elements are Integers that represent the line number of each row
Data stores the information of each column. To get a value for a row, the List can be indexed by row number
Since we extended CachedModel, it is important to implement getCacheKeyForType. The simplest is to just return the type.
We only support the ROW type
getAllOfKindFromModel just calls getAllOfTypeFromModel, because we don't support inheritance. In a driver whose metamodel supports inheritance, all-of-kind will be different, for example using instanceof if the framework provides a Java implementation.
We use an inner class that has access to the data
Note that this very simple implementation does not have any type checking, object ownership, property validation, etc. It is very fragile.
Remember to return instances of these classes in your getPropertySetter/Getter methods
We need to allow the user to provide information about the model. DO NOT EXPLAIN THE FIELDS HERE
What needs to be configured?
Identification, cache and load/store options are inherited from AbstractCachedModelConfigurationDialog
We need a browser to select the CSV file
We need some additional CSV options
One Text field for the file location
One Text field for the separator
Two buttons, headers and varargs
The getModelName/Type is important, later it needs to match the extension point configuration
Next we will see a bit of the SWT code that is involved.
BrowseWorkspaceForModelsListener is provided by the dt tools
Field separator because CSV can use dot, space, tab, semicolon, etc.
Headers are the first line of the file, which gives the name of each column
Varargs Headers is used when a row can have more columns than the headers
We need to inform the Epsilon Framework that there is a new type of model.
We use Eclipse support (OSGi)
The user can now use the launch configuration to provide the model information. We need to save the information so that we can use it to load the model.
Create the constants in the model class. When loading the model we have access to the properties object. Next we use these properties to load the model and finish the implementation of the model
We also add an implementation of “owns”
Without going into much detail, we read all the lines in the file, use the first line to get the property names, then store the data in the Map
Maybe debug to follow code?
Not showing the details of the implementation
Suppose you want to find an element that matches a particular attribute value, e.g. find someone by id (social security number)
When you do a select operation, there is a for loop behind the scenes
If supported by the model data, this could be optimized, e.g. in a database by doing a direct SELECT on the DB, thus delegating the search to the backend
Not going to show details, it is identical to type column selection
All interface methods are delegated to the delegate list. We optimize the select operation; others could be added.
Let's analyze the select operation to determine if we can optimize it
Execute is the method in the SelectOperation class that we want to override
Target should be a CsvProCollection