Enhancing and Preparing TIMES for High Performance Computing

Evangelos Panos, Tarun Sharma
(Status report on ETSAP funded project)
72nd ETSAP-MEETING | 11th Dec 2017, ETH Zurich, Switzerland

Contents
• Premise
• Transferring TIMES on LINUX: modifications needed
• Application of TIMES on LINUX by running single/batch jobs of scenarios
• Running multiple independent scenarios with TIMES on LINUX
• By manually creating scripts controlling the execution and result collection
• By modifying the TIMES code using the Grid Computing Features of GAMS

Premise
Constraint matrix sizes for different TIMES-based models
• Solution time on a typical workstation:
• Irish TIMES: 11 minutes
• ETSAP TIAM: 1.3 hours
• JRC EU TIMES: 4.2 hours
• SWISS TIMES: 2 hours
• Uncertainty analysis is part of best practice (DeCarolis, Daly et al. 2017).
• High Performance Computing (HPC): aggregating computing power to deliver better performance when solving large problems.
• Most of these systems run on LINUX.

TIMES on LINUX: single scenario
• Transferring TIMES-based models to Linux
• Code is compatible with minor changes
• Running TIMES models on a GAMS Linux installation
• Scripts for running batches of model instances
• Transferring results from Linux to Windows for post-processing with VEDA_BE
• .gdx files are compatible between the platforms and data transfer can be easily established
• Scripts for converting batches of GDX files to VEDA_BE files

TIMES on LINUX: multiple independent scenarios
• Making GAMS and CPLEX/Barrier run on specific CPU cores
• Simultaneously running model instances on selected CPU cores
• This can be done either manually by the user or automatically by using the GAMS Grid Computing features
• When the user manually generates and runs model instances on selected CPU cores, the user:
• Must generate the *.dd files for every model instance in VEDA-FE
• Must write OS-dependent scripts to run each scenario and assign CPU cores to it
• Must write OS-dependent scripts to check if a scenario has finished and collect its solution
• When using the Grid Computing features of GAMS:
• The user must only write a GAMS solve loop, in each iteration of which a scenario is solved
• GAMS undertakes the assignment to CPUs (this can also be controlled by the user if desired)
• GAMS undertakes the polling to check if a scenario is ready and collects the results
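As an illustration of the manual route, the "check if a scenario has finished" step could be sketched as below. This is only a sketch under an assumption that is not stated in the slides: a scenario is treated as finished as soon as its .gdx file exists in the output directory. The function name check_finished is hypothetical.

```shell
#!/bin/bash
# Sketch: report which scenarios already have a solution file.
# Assumption (not from the slides): a scenario counts as finished
# once <outdir>/<scenario>.gdx exists.
# check_finished is a hypothetical helper name.
check_finished() {
    local outdir="$1"; shift
    local s
    for s in "$@"; do
        if [ -f "$outdir/$s.gdx" ]; then
            echo "$s finished"
        else
            echo "$s pending"
        fi
    done
}
```

It could be called, for example, as check_finished gamssave test1 test2 test3. A file that exists may still be in the process of being written, so a stricter check (for instance on the exit status of the GAMS process) may be preferable in practice.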

Transferring TIMES based model to Linux
Contents of the ETSAP Project Report
1. Setting up a Linux computer
1.1. Some basic Linux commands
1.2. Terminal and File Manager
2. Installing GAMS on Linux
3. Transferring a TIMES-based model to Linux
4. Running a TIMES model on Linux
5. Transferring results from Linux to Windows for post-processing with VEDA_BE
6. Scripts
6.1. Running a TIMES scenario in Linux
6.2. Batch job with multiple scenarios in Linux
6.3. Batch job for creating VEDA_BE files in Windows
7. Troubleshooting a TIMES run in Linux
7.1. GAMS error: Scratch directory does not exist
7.2. GAMS error: Unable to open input file
7.3. Linux error: File permission error

Transferring TIMES based model to Linux
Contents of the ETSAP Project Report (continued)
8. Preparing TIMES for High Performance Computing
8.1. Multiple parallel scenarios in TIMES using user-defined scripts that are assigned to multiple CPUs
8.1.1. Making GAMS and CPLEX/Barrier run on specific CPU(s)
8.1.2. Run multiple independent scenarios in parallel
8.2. Multiple parallel scenarios in TIMES using the GAMS grid features
8.2.1. Introduction to Grid Computing by using the GAMS language
8.2.2. A simple illustrative example of Grid Computing using GAMS language features
8.2.3. Equipping TIMES with grid computing features
8.2.4. Writing a GAMS file for a TIMES-based model to query the status of the jobs submitted to the grid
8.2.5. Writing a collection script for TIMES
8.2.6. Executing TIMES using the GAMS Grid Computing features
8.3. Issues for further investigation regarding running TIMES over a grid of CPUs

Transferring TIMES to LINUX
and running a single or a batch job of scenarios

Cross-platform compatibility of the TIMES source code
1. Linux is case sensitive, so all filenames should be consistently either lowercase or uppercase
2. The run file generated by VEDA should be named in lowercase
3. Three major changes are required to the source code:
• calling GAMS with the option FILECASE=2 → this enforces lowercase filenames
• converting all the model names in the TIMES code to lowercase, "times" and "times_macro", to ensure compatibility with the FILECASE=2 option when producing the .gdx solution files
• replacing all system calls from GAMS in the TIMES code from MS-DOS commands to POSIX (Portable Operating System Interface) commands that operate under both Windows and Linux
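To make the lowercase requirement concrete, a small helper could rename the files copied over from Windows. The sketch below is not part of TIMES or VEDA; the function name lowercase_files is hypothetical, and it assumes that all files to be renamed sit directly in one directory.

```shell
#!/bin/bash
# Sketch: rename every regular file in a directory to its lowercase name.
# lowercase_files is a hypothetical helper, not part of TIMES or VEDA.
lowercase_files() {
    local dir="$1"
    local f base lower
    for f in "$dir"/*; do
        # ignore subdirectories and an unmatched glob pattern
        [ -f "$f" ] || continue
        base=$(basename "$f")
        lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
        # skip names that are already lowercase; never overwrite an existing file
        if [ "$base" != "$lower" ] && [ ! -e "$dir/$lower" ]; then
            mv "$f" "$dir/$lower"
        fi
    done
}
```

For example, lowercase_files "$HOME/gams_wrktimes" would turn a file such as Base.DD into base.dd while leaving test.run untouched.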

Cross-platform compatibility of the TIMES source code → 3 major changes are required
1. File name: *.run
   Current TIMES code (v400):
     $ SET MODEL_NAME 'TIMES'
   Modification needed for Linux:
     $ SET MODEL_NAME 'times'
2. File name: maindrv.mod
   Current TIMES code (v400):
     $ IF '%MACRO%' == YES $SET MODEL_NAME 'TIMES_MACRO' SET TIMESED '0' SETLOCAL SRC tm
   Modification needed for Linux:
     $ IF '%MACRO%' == YES $SET MODEL_NAME 'times_macro' SET TIMESED '0' SETLOCAL SRC tm
3. File name: eqlducs.vda
   Current TIMES code (v400):
     $ if exist cplex.opt execute "copy /Y cplex.opt+indic.txt cplex.op2 > nul";
     $ if exist xpress.opt execute "copy /Y xpress.opt+indic.txt xpress.op2 > nul";
   Modification needed for Linux:
     $ if exist cplex.opt execute "cat cplex.opt indic.txt > cplex.op2";
     $ if exist xpress.opt execute "cat xpress.opt indic.txt > xpress.op2";

Verification of execution
• The .dd files generated on Windows with VEDA-FE should be transferred to Linux prior to execution
• The GAMS call requires the paths to the TIMES source files and to the .dd files of the scenario run
• Calling GAMS to solve a TIMES instance test.run in Linux from the gams_wrktimes directory, in a Linux terminal window:
    gams test.run idir=…/gams_srctimesv400 filecase=2 gdx=gamssave/test.gdx
Post-processing of the result .gdx file with VEDA-BE
• When the run completes, the test.gdx file generated in Linux is compatible with Windows
• It can be transferred to Windows and processed with the 'gdx2veda' utility to generate the .vd* files required for post-processing with VEDA-BE (gdx2veda comes with the GAMS installation)
• Calling gdx2veda to generate the .vd* files for VEDA-BE in the command prompt in Windows:
    gdx2veda test.gdx C:\VEDA\VEDA_FE\GAMS_SrcTIMESv400\times2veda.vdd test

Script in Linux for running a scenario
Script name: run.sh
#!/bin/bash
# set the path to the TIMES source files
times=$HOME/gams_srctimesv400
# set the path to the model data definition files
ddfiles=$HOME/gams_wrktimes
# execute gams to run the model
gams $1.run idir=$times:$ddfiles filecase=2 gdx=$ddfiles/gamssave/$1.gdx
Running the script from a Linux terminal:
etsap-user@etsap-user:~/gams_wrktimes$ ./run.sh test
• Why do we need a script?
• to encapsulate the paths to the source code and to the input files
• to encapsulate all GAMS options required to run TIMES and obtain the .gdx file with the results
• Thus, we create a script in Linux similar to the VTRUN.cmd script generated by VEDA-FE when running under Windows

Script for executing a batch of scenarios
Script name: batch_exo.sh
for s in $1
do
./run.sh $s
done
Running the script from a Linux terminal to solve 3 scenarios in a batch:
etsap-user@etsap-user:~/gams_wrktimes$ ./batch_exo.sh "test1 test2 test3"

Script for processing a batch of gdx to veda
Script name: batch_linux2veda.cmd
for %%s in (%*) do (
gdx2veda %%s.gdx C:\Veda\VEDA_FE\Gams_srcTIMESv400\times2veda.vdd %%s
)
Running the script from the Windows command prompt to collect the results of 3 scenarios run in Linux:
C:\Users\etsap\Desktop\linux2veda> batch_linux2veda test1 test2 test3
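A possible variant of the Linux batch script takes the scenarios as separate arguments and reports the ones that failed. It is only a sketch: it assumes that run.sh passes on a non-zero exit status when GAMS fails, and both the function name run_batch and the RUN_CMD override are hypothetical.

```shell
#!/bin/bash
# Sketch: run a batch of scenarios one after another and list the ones that failed.
# Assumption: the runner returns a non-zero exit status when the run fails.
# run_batch and RUN_CMD are hypothetical names; by default run.sh is used.
run_batch() {
    local runner="${RUN_CMD:-./run.sh}"
    local failed=""
    local s
    for s in "$@"; do
        if ! "$runner" "$s"; then
            failed="$failed $s"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failed:$failed"
        return 1
    fi
    echo "all scenarios finished"
}
```

A call such as run_batch test1 test2 test3 then continues with the remaining scenarios after a failure instead of stopping silently.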

Single CPU Linux vs Windows performance
• Initial tests on a typical desktop with 1 CPU with 8 logical cores, with Linux installed as a guest OS over Windows
• Performance gain (shorter solution time) in Linux over Windows: Swiss TIMES model 20%, Irish TIMES model 50%
• The improvement in performance seems to depend on the model size (needs further investigation)
• Swiss TIMES solution time in Windows: 126 minutes
• Swiss TIMES solution time in Linux: 100 minutes

Running multiple independent scenarios on Linux on different CPU cores
by manually creating execution and result collection scripts
→ All the required input files *.dd should have been created in advance in VEDA_FE

Script to run scenarios on a set of specific cores
Script name: p_run.sh
#!/bin/bash
# set the path to the TIMES source files
times=$HOME/gams_srctimesv400
# set the path to the model data definition files
ddfiles=$HOME/gams_wrktimes
# execute gams to run the model on the CPU cores selected by the mask $2
taskset $2 gams $1.run idir=$times:$ddfiles filecase=2 gdx=$ddfiles/gamssave/$1.gdx
Running the script from a Linux terminal:
etsap-user@etsap-user:~/gams_wrktimes$ ./p_run.sh test 3
Second argument to the script: indexing CPU cores for taskset
• First argument $1: scenario to run
• Second argument $2: index (hexadecimal mask) of the CPU group on which the solver runs

CPU core:    12  11  10   9   8   7   6   5   4   3   2   1   Decimal representation                Hexadecimal
Core index:  11  10   9   8   7   6   5   4   3   2   1   0   (used core indices as exponents)      (second argument)
Mask bits:    0   0   0   0   0   0   0   0   1   1   1   1   2^3 + 2^2 + 2^1 + 2^0 = 15            f
Mask bits:    0   0   0   0   1   1   1   1   0   0   0   0   2^7 + 2^6 + 2^5 + 2^4 = 240           f0
Mask bits:    1   1   1   1   0   0   0   0   0   0   0   0   2^11 + 2^10 + 2^9 + 2^8 = 3840        f00
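The mask arithmetic can be reproduced with shell integer arithmetic. The helper below is a sketch for a contiguous range of core indices (counting from 0); the name core_mask is hypothetical and not part of the project scripts.

```shell
#!/bin/bash
# Sketch: hexadecimal taskset mask for a contiguous range of CPU core indices.
# core_mask is a hypothetical helper; core indices start at 0.
core_mask() {
    local first="$1" last="$2"
    local count=$(( last - first + 1 ))
    # set "count" bits, then shift them up to the first core index
    printf '%x\n' $(( ((1 << count) - 1) << first ))
}
```

With this helper, core_mask 0 3 prints f, core_mask 4 7 prints f0 and core_mask 8 11 prints f00, so a run could be started as ./p_run.sh test "$(core_mask 4 7)".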

Execution of a scenario on one core
The second argument determines the cores available to the CPLEX/Barrier algorithm for parallelization in the linear algebra, i.e., 1 core for this illustration.

Execution of a scenario on two cores
The second argument determines the cores available to the CPLEX/Barrier algorithm for parallelization in the linear algebra, i.e., 2 cores (numbers 1 and 2) for this illustration.
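Several scenarios can be started at the same time by launching p_run.sh in the background once per scenario, each with its own core mask. The following is a minimal sketch, not the script used in the project; the function name run_parallel, the scenario:mask argument format and the P_RUN_CMD override are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: start several scenarios at once, each pinned to its own group of cores.
# Each argument has the hypothetical form scenario:mask (mask in hexadecimal).
# run_parallel and P_RUN_CMD are hypothetical names; by default p_run.sh is used.
run_parallel() {
    local runner="${P_RUN_CMD:-./p_run.sh}"
    local pair
    for pair in "$@"; do
        # split "scenario:mask" and launch the run in the background
        "$runner" "${pair%%:*}" "${pair##*:}" &
    done
    # block until every background run has ended
    wait
}
```

For instance, run_parallel test1:f test2:f0 test3:f00 would place three scenarios on three disjoint groups of four cores on a 12-core machine.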

TIMES for High-Performance Computing: Application
• TIMES on FIONN at the Irish Centre for High-End Computing
• Scripts for simultaneous execution of multiple scenario batches, e.g., 1000 scenarios (5 batches of 200 scenarios each) solved in 3 hours
• Work flow optimization
• Experiments:
• Solution time vs number of cores assigned
• 48 virtual cores per node of FIONN
• Such results + further investigation
• Optimal distribution of compute load
• Maximum utilization of computing resources
• Minimization of solution time
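Splitting a long list of scenarios into batches, as in the 5 batches of 200 scenarios mentioned above, could be sketched as follows. The helper name make_batches is hypothetical, and the actual FIONN scripts may have been organised differently.

```shell
#!/bin/bash
# Sketch: print scenario names in batches of a given size, one batch per line.
# make_batches is a hypothetical helper, not the script used on FIONN.
make_batches() {
    local size="$1"; shift
    local batch="" count=0 s
    for s in "$@"; do
        # append the scenario to the current batch, separated by a space
        batch="${batch:+$batch }$s"
        count=$((count + 1))
        if [ "$count" -eq "$size" ]; then
            echo "$batch"
            batch=""
            count=0
        fi
    done
    # print the last, possibly incomplete, batch
    if [ -n "$batch" ]; then
        echo "$batch"
    fi
}
```

Each printed line is a space-separated list of scenarios, which is the form expected by the first argument of batch_exo.sh.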
[Figure: solution time (axes: time in hours, time in seconds) versus number of virtual cores (6 to 48) for JRC EU TIMES and the TIAM 2D scenario]

Running multiple independent scenarios on Linux on different CPU cores
by using the GRID COMPUTING language features of GAMS
→ This is an exploration of the feasibility and not a proposed design
→ Better integration in TIMES could be designed and implemented in a separate project, after coordination with Antti, Amit and everyone else who is interested in it

What is Grid Computing
• It combines computers from multiple administrative domains
• It is a form of distributed computing
• A "super virtual computer" is composed of many networked, loosely coupled computers to achieve a goal
• There is a submitting system where the user creates a large task
• The task is partitioned into smaller tasks
• These smaller tasks are solved by the computers in the grid
• Advantages: Saves money, Increases efficiency, Solves problems
• Disadvantages: Requires unique software, Computers can drop-off
https://www.electronicproducts.com/Computer_Systems/Servers/Cloud_computing_vs_grid_computing.aspx

What is Cloud Computing
• It enables users to employ resources from any web-connected device
• Cloud computing provides 3 main services:
• Infrastructure (IaaS): virtual server for storage
• Platform (PaaS): software to create applications
• Software (SaaS): infrastructure and products
• Advantages: saves money, high performance, easy to use
• Disadvantages: security and privacy
https://www.electronicproducts.com/Computer_Systems/Servers/Cloud_computing_vs_grid_computing.aspx

What is the difference between Grid Computing and Cloud Computing
• Cloud computing and grid computing have similar goals
• With grid computing:
• the user assigns one large task that gets divided into several smaller portions and executed on different machines
• With cloud computing:
• the user enjoys a host of readily available web-based services (without investing in any underlying architecture); the services can be combined to provide a homogeneous and optimised experience

Exploring Grid Computing with TIMES
• GAMS normally operates in a synchronous mode and waits until the solver terminates
• When independent model solutions are required, an asynchronous mode is an option to increase performance
• GAMS provides asynchronous solving via the Grid Computing and Multi-threading features
• We need the following loops in the GAMS code:
• Submission Loop: in this phase we generate and submit models that can be solved independently
• Inquiry Loop: in this phase we check which solutions have finished
• Collection Loop: in this phase the previously submitted models are collected as soon as a solution is available
• This implies that the scenarios of a TIMES-based model are solved via a GAMS loop instead of using separate script files, one for each scenario

An exploratory implementation in TIMES (1)
• In the .run template file we create a GAMS set with the names of the scenarios we want to solve:
SET GRID_SCENARIOS /scen1, scen2, scen3/;
• In the .run template file we declare a parameter to hold the handle (i.e. the ID) of each submitted scenario:
PARAMETER HANDLE(GRID_SCENARIOS) store the instance handle;
• In the solve.mod file we make GAMS aware that the TIMES model will solve in an Asynchronous mode:
%MODEL_NAME%.solvelink = %solvelink.AsyncGrid%;
• In the solve.mod file we include a loop over the scenarios; in each iteration we update the required TIMES
parameter(s) and then we submit the scenario to GAMS to be solved

The submission loop in a Grid Computing Facility
• In each iteration we update the required TIMES parameter, then we submit the scenario to GAMS and we
keep its handle (the internal ID) for identifying this scenario in the collection phase
* turn on the grid option
%MODEL_NAME%.solvelink = %solvelink.AsyncGrid%;
LOOP(GRID_SCENARIOS,
* update model actual parameters with scenario data (here the CO2 tax)
OBJ_COMNT(R,DATAYEAR,C,S,'TAX',CUR)$ GS_COM_TAXNET (GRID_SCENARIOS, R, DATAYEAR, C, S, CUR) =
GS_COM_TAXNET (GRID_SCENARIOS, R, DATAYEAR, C, S, CUR);
* solve the model
SOLVE %MODEL_NAME% MINIMIZING objZ USING %METHOD%;
* keep the handle of the job for future reference
HANDLE(GRID_SCENARIOS)=%MODEL_NAME%.handle;
);
In this exploratory implementation, we additionally defined the input parameter over the scenario dimension and then had to update an internal parameter of TIMES, i.e. a parameter that is not visible to the user via VEDA_FE.
More elegant designs and better integration into TIMES could be explored, if the grid computing features are of interest to the ETSAP community, in coordination with Antti, Amit and everyone else who is interested in it.

Running TIMES in a Grid Computing Facility
• To submit the TIMES model to a Grid Computing Facility, we must call GAMS as follows (Linux example):
gams times_demo_co2.run idir=../times filecase=2 s=submit gdir=grid
• During the submission, three folders are created under the folder grid, holding the solutions of the 3 scenarios; the names of the folders correspond to the handles of the scenarios
• During the solving, each subfolder in the grid directory keeps the execution scripts generated automatically by GAMS, as well as the model matrix and the solution of each scenario

Running TIMES in a Grid Computing Facility
• In the first iteration of the loop, GAMS submits scen1, and assigns a handle to it which has the same name as
the corresponding folder in the grid directory

When scen1 is running on CPU1, GAMS submits scen2 to CPU2
Scen1 is at iteration 15 of CPLEX/Barrier by the time scen2 is submitted
When the generation of the model matrix of scen2 is finished, scen1 is already at iteration 20

Collection of the results from the GRID
• We implement a GAMS source file, similar to the report main driver file of TIMES
• It takes as a call parameter %gams.user1% the name of the scenario
scalar h handle of each scenario;
h = handlecollect(handle("%gams.user1%"));
%MODEL_NAME%.handle=HANDLE("%gams.user1%");
execute_loadhandle %MODEL_NAME%;
*-----------------------------------------------------------------------------
* produce the reports (THIS CODE IS THE SAME AS IN THE REPORT GENERATOR IN TIMES)
*-----------------------------------------------------------------------------
$ LABEL REPORT
$ BATINCLUDE rptmain.%gams.user2% %gams.user2% NO_EMTY
$ IF NOT %TIMESED%==0
$ IF NOT %TIMESED%==YES $BATINCLUDE wrtbprice.mod
$ IF SET SPOINT $BATINCLUDE spoint.%gams.user2% 0
*-----------------------------------------------------------------------------
* do a check on compile/execute errors from reports
*-----------------------------------------------------------------------------
$ BATINCLUDE err_stat.mod '$IF NOT ERRORFREE' ABORT '*** ERRORS IN GAMS COMPILE ***'
$ BATINCLUDE err_stat.mod ABORT EXECERROR '*** ERRORS IN GAMS EXECUTION ***'
*-----------------------------------------------------------------------------
* dump to the gdx file for VEDA_BE
execute_unload "gamssave/%gams.user1%.gdx";
The collection loop can also be implemented in TIMES without changing the reporting routines, but this would require some design considerations that were not pursued at this stage.

Collecting the results for processing in VEDA-BE
• We call the previous script to collect the solution from the grid subdirectory corresponding to the scenario we want to process in VEDA-BE; this generates the .gdx file with the solution:
gams gc_report.gms idir=../times r=submit gdir=grid filecase=2 user2=mod user1=scen1
• The .gdx files of the scenarios can then be transferred to Windows for processing with VEDA-BE by using the gdx2veda utility
[Figure: collection on Linux of the .gdx files with the solutions of the 3 scenarios, followed by processing on Windows with VEDA-BE]

Conclusions
• TIMES could be turned into a cross-platform modelling framework with minor code modifications
• Single-CPU solution times seem to be 20 to 50% shorter in Linux than in Windows (this depends on the model size/structure and needs further tests)
• Running multiple independent scenarios can be done either by:
• Manually generating and solving the scenarios through user-defined scripts
• Making use of the advanced GAMS language features for Grid Computing (needs modification of the TIMES source code)
• Using multiple nodes for running multiple scenarios results in optimum distribution of computation load (maximum use of resources, minimum solution time)

Conclusions
• Suggestions from this report regarding the transfer of TIMES to Linux would be implemented in the next release of the TIMES source code (Antti kindly offered to do this)
• Better integration of grid computing and multi-threading features into TIMES could be considered and implemented if there is general interest (in coordination with Antti, Amit and everyone else who is interested in this facility)
• What is next? Exploring smart decompositions of a TIMES-based model so as to exploit multiple CPU cores for a single scenario run
• Benefiting from the BEAM-ME project (PSI and UCC are on the advisory board)
• Using new GAMS language features regarding model annotation for facilitating decomposition
Thank you for your attention
Our thanks go to:
Gary Goldstein
Antti Lehtilä
Maurizio Gargiulo
Joseph DeCarolis, Sonia Yeh
.. and to many others who tried in the past to run TIMES under Linux
and are not visible to the rest of the TIMES community
