# 20th ARCO 2015 ITU

The next meeting in ARCO will take place at the IT University on the 29th of May.

The talks will be held in Auditorium 4 (4th floor, green side).

Please register your participation by using the link below so we can correctly predict the necessary amount of coffee, lunch and seating required. Participation is free of charge.

Registration: http://1drv:ms/1bjCp7n

### Program

9:30-10:00 | Coffee and light breakfast |

10:00-10:30 | Stephen Alstrup Some Labeling Problems |

10:30-11:00 | Dimitris Paraschakis Comparative evaluation of Top-N recommenders in e-commerce: industry perpective |

11:00-11:30 | Coffee break |

11:30-12:00 | Riko Jacob Algorithm Engineering for Numerics: Sparse Grids and the Memory Hierarchy |

12:00-12:30 | Mikkel Abrahamsen An Optimal Algorithm for the Separating Common Tangents of two Polygons |

12:30-13:30 | Lunch buffet at the ITU canteen 1st floor |

13:15-13:30 | Business meeting (Steering committee only) |

13:30-14:00 | Thomas Dueholm Hansen An improved version of the Random-Facet pivoting rule for the simplex algorithm |

14:00-14:30 | Rasmus Pagh Large-scale similarity joins with guarantees |

14:30-15:00 | Coffee and cake |

15:00-15:30 |
Eva Rotenberg |

15:30-16:00 |
Tobias Christiani |

### Abstracts

Some recent results and open problems will be presented

**Comparative evaluation of Top-N recommenders in e-commerce: ****industry perpective**

In this paper we experiment on two real e-commerce datasets and survey more than 20 popular e-commerce platforms to reveal what methods work best for product recommendations in industrial settings. Despite recent academic advances in the field, our results show that simple methods such as best-seller lists still dominate deployed recommendation engines in e-commerce. We found our experimental results to be well-aligned with those of the survey, where in both cases simple personalized recommenders achieved higher ranking than more advanced techniques. We also point to the need for more realistic evaluation methods after comparing the results from the traditional random sampling versus our proposed chronological sampling of events. The proposed method demonstrated the impact of the recency and the sequence of purchases on prediction accuracy and can be used for determining the optimal length of the training history for optimizing the performance of algorithms. This performance is also affected by a proper hyperparameter tuning, for which we propose Golden Section Search as a fast alternative to other optimization techniques.

**From Independence to Expansion and Back Again**

We consider the following fundamental problems:

-- Constructing k-independent hash functions with a space-time tradeoff close to Siegel's lower bound.

-- Constructing representations of unbalanced expander graphs having small size and allowing fast computation of the neighbor function.

It is not hard to show that these problems are intimately connected in the sense that a good solution to one of them leads to a good solution to the other one. In this paper we exploit this connection to present efficient, recursive constructions of $k$-independent hash functions (and hence expanders with a small representation). While the previously most efficient construction (Thorup, FOCS 2013) needed time quasipolynomial in Siegel's lower bound, our time bound is just a logarithmic factor from the lower bound.

Joint work with Rasmus Pagh and Mikkel Thorup

**An Optimal Algorithm for the Separating Common Tangents of two Polygons**

We describe an algorithm for computing the separating common tangents of two simple polygons using linear time and only constant workspace. A tangent of a polygon is a line touching the polygon such that all of the polygon lies to the same side of the line. A separating common tangent of two polygons is a tangent of both polygons where the polygons are lying on different sides of the tangent. Each polygon is given as a read-only array of its corners. If a separating common tangent does not exist, the algorithm reports that. Otherwise, two corners defining a separating common tangent are returned. The algorithm is simple and implies an optimal algorithm for deciding if the convex hulls of two polygons are disjoint or not. This was not known to be possible in linear time and constant workspace prior to this paper.

An outer common tangent is a tangent of both polygons where the polygons are on the same side of the tangent. In the case where the convex hulls of the polygons are disjoint, we give an algorithm for computing the outer common tangents in linear time using constant workspace.

(Accepted for SoCG15.)

**An improved version of the Random-Facet pivoting rule for the simplex algorithm**

The Random-Facet pivoting rule of Kalai and of Matousek, Sharir, and Welzl is an elegant, randomized pivoting rule for the simplex algorithm, the classical combinatorial algorithm for solving linear programs. The expected running time of the simplex algorithm when using this rule is subexponential in the combinatorial size--the number of variables and inequalities--of the linear program. This is currently the best known combinatorial bound for solving general linear programs. Other polynomial time algorithms are known, but their running time depends also on the number of bits in the representation of the linear program.

We present a slightly improved version of the Random-Facet pivoting rule, thus obtaining the fastest known combinatorial algorithm for solving linear programs, the first improvement in over 20 years. Our results apply not only to linear programs, but also to more general, abstract LP-type problems. In particular we also obtain the fastest known algorithm for solving two-player turn-based stochastic games.

Joint work with Uri Zwick.

**Algorithm Engineering for Numerics: Sparse Grids and the Memory Hierarchy **

Sparse grids are a numerical discretization scheme that reduces the

curse of dimensionality. This comes at the price of a more

complicated neighborhood structure and the corresponding memory access

patterns.

In this talk, I will introduce one basic algorithm task for a sparse grid

and discuss its I/O complexity depending on how it is solved.

Together with experimental results for different implementations, this

makes the case that here the I/O-model is appropriate to work with.

**Dynamic Planar Embeddings of Dynamic Graphs**

Joint work with Jacob Holm.

We present an algorithm to support the dynamic embedding in the plane of a dynamic graph. An edge can be inserted across a face between two vertices on the boundary (we call such a vertex pair linkable), and edges can be deleted. The planar embedding can also be changed locally by flipping components that are connected to the rest of the graph by at most two vertices. Given vertices u,v, linkable(u,v) decides whether u and v are linkable, and if so, returns a list of suggestions for the placement of (u,v) in the embedding. For non-linkable vertices u,v, we define a new query, one-flip-linkable(u,v) providing a suggestion for a flip that will make them linkable if one exists. We will support all updates and queries in O(log^2 n) time. Our time bounds match those of Italiano et al. for a static (flipless) embedding of a dynamic graph. Our new algorithm is simpler, exploiting that the complement of a spanning tree of a connected plane graph is a spanning tree of the dual graph. The primal and dual trees are interpreted as having the same Euler tour, and a main idea of the new algorithm is an elegant interaction between top trees over the two trees via their common Euler tour.

**Large-scale similarity joins with guarantees**

The ability to handle noisy or imprecise data is becoming increasingly important in computing. In the database community the notion of similarity join has been studied extensively, yet existing solutions have offered weak performance guarantees. Either they are based on deterministic filtering techniques that often, but not always, succeed in reducing computational costs, or they are based on randomized techniques that have improved guarantees on computational cost but come with a probability of not returning the correct result.

The aim of this talk is to give an overview of randomized techniques for high-dimensional similarity search, and discuss recent advances towards making these techniques more widely applicable by eliminating probability of error and improving the locality of data access.