Update (15. Oct 2024): the paper was accepted and has been published!
What am I talking about?
As part of the Data Science Master at TU Wien, we are required to work on the Interdisciplinary Project in Data Science. Students are expected to work (for free) on a research project supervised by another research group or institution that focuses on non-DS topics. The idea is to apply Data Science to a non-Data Science domain.
I just wanted to talk about how great this project turned out, and how good results we got. 👍🏻
Who supervised me?
I worked with the Institute for Transport and Logistics Management of the WU Vienna. And
- Vera Hemmelmayr (Assoz.Prof PD Dr. ?)
- Günther Raidl (Ao.Univ.Prof. Dipl.-Ing. Dr.techn. ?)
supervised me. They provided excellent guidance, and they are very nice. You should work with them if you have a change. Moreover, they gave me the opportunity to publish our work, and they let me be the first author (as it should be, but in academia you never know).
The optimization problem that we solved
We solved the Selective Assessment Routing Problem (SARP), which is the following problem:
- Imagine some natural disaster or emergency (say an earthquake) that affects some region.
- We are a humanitarian organization that will provide aid for the affected population
- The very first step: Rapid Needs Assessment
- Evaluate and prioritize the immediate needs of affected populations
- The next steps (providing the needs) is out of the scope of the problem
- We have some teams that will take cars and visit some affected regions (not all of them, time is limited)
- What sites do we visit and in which order? This is our problem
- In order to select the sites, we assume a set of characteristics of the sites (e.g., geographical, demographic, or socio-economical factors) that should be covered in a balanced way
- This means we wouldn’t visit all the sites on the coast and forget about the sites on the mountains
- Or, for instance, we want to cover the poor or rich areas equally
- In essence, we select the sites fairly in order to get the best picture of the situation
Our Mixed Integer Programming Model
I don’t know if the reader is familiar with Mathematical Programming. I will just assume you are. The SARP was first defined in this paper: Balcik (2017), and the author proposed a MIP model using MTZ constraints. But they mostly focused on some Tabu Search Heuristic.
In the project, we come up with different alternative formulations (Cutting Sets, 2-indexed MTZ, and Single Commodity Flow), we study them theoretically, and computationally test them. Here I present one of them.
Single Commodity Flow formulation for the SARP
The notation used for the problem data
We use the following notation. is the set of affected sites. is the set of teams (or vehicles), and each team departs from the origin and returns back to it after completing the route, which takes at most time units. The travel time is represented by . Let denote the set of critical characteristics. A binary coverage parameter is 1 if node carries characteristic . The total number of sites that carry characteristic is represented by .
The notation used for our formulation
We propose a two-index Single Commodity Flow formulation using the following variables: is the flow circulating from to and is bounded by 0 and . Variables takes value 1 if some vehicle visits site right after , and 0 otherwise. takes value 1 if some vehicle visits site , and 0 otherwise. is a continuous variable modeling the coverage rate, bounded by 0 and 1.
And this is the MIP model:
Our results and findings
We focused on exact methods, which have advantages and disadvantages compared to heuristics. From the exact perspective, one can study different formulations and try to establish whether one is better than another (which corresponds to proving the inclusion of the polyhedral projections of one into the other).
We did this, and it’s very nice because we could prove the Single Commodity Formulation (SCF) that I presented above is strictly stronger than the MTZ formulation in Balcik (2017).
The proof I came up with is quite simple. It’s just checking that it is true. Click to see.
Our computational results are also quite good. Our exact method is comparable in terms of solutions to the Tabu Search heuristic of Balcik.
You can find our implementation and the project report here: Exact Sarp - GitHub - jsalvasoler
What next?
Since the results are very decent, my supervisors suggested writing a paper about it, which is now under review for publication in an Operations Research Journal. But I heard this can take a very long time. 🤷🏻
And I wanted to end this blog post by just reflecting on my learnings:
- MIP modeling is powerful and nice, and proving stuff regarding formulations is not that hard (proof is that I could do it).
- Gurobi (the solver used) is very strong.
- Doing research in optimization topics is great because the problems are so well-defined. Given a problem, we just want to get the best solution given a time budget. But this is actually one of the problems or optimization research: problems in the real world are not well-defined.
- Doing good work (I worked hard) leads to nice opportunities (publishing).