## FOCS 2009

The list of accepted papers for FOCS2009 was published a couple of days ago. The page with the abstracts is here and a post on “Graph Theory” included the pdf file for some of the papers (those available).

I was able to find some of the missing papers:

“A Parallel Repetition Theorem for Any Interactive Argument” – Iftach Haitner.

“Optimal Long Code Test with One Free Bit” – Nikhil Bansal and Subhash Khot.

“Submodular Function Minimization under Covering Constraints” – Satoru Iwata and Kiyohito Nagano.

I’ve been checking out the ones that seem more interesting (for me, don’t want to start a war on best papers or imply that other papers aren’t interesting).

One of them is the paper “Space-Efficient Framework for Top-k String Retrieval Problems” by Wing Kai Hon, Rahul Shah and Jeffrey Scott Vitter.  It’s a really interesting paper about a problem we have been looking at for a while (with Diego Arroyuelo, Meng He and Ian Munro) and I found that the solution they propose is a very interesting approach, it allows to handle a wide family of scoring functions for the top-k string retrieval problem.

Another paper that caught my interest is “Models for the compressible Web” by Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Alessandro Panconesi and Prabhakar Raghavan. I couldn’t find the pdf on that one so I haven’t read it yet. It seems to be closely related to my master thesis (yes, it’s written in English) and it would be interesting to see if some of the results there help on explaining in a theoretical way the empirical results we obtained in the thesis.

## Text Partitioning

On optimally partitioning a text to improve its compression
Written by Paolo Ferragina, Igor Nitto and Rossano Venturini

I found this article really interesting and nicely presented. The problem they focus on is:

Problem: Given a compressor $$C$$ and a text $$T$$ of length $$n$$, drawn from an alphabet $$\Sigma$$ of size $$\sigma$$. Find the optimal partition of $$T=T_1T_2\ldots T_k$$ such that $$|C(T_1)C(T_2)\ldots C(T_k)|$$ is minimized.

This means that we want to cut the text in $$k$$ pieces, with $$k$$ unknown, such that applying the compressor $$C$$ over each piece achieves the best compression possible over the text. This is ignoring possible permutations of the text, such as the Burrows-Wheeler Transform (BWT).

A simple solution is to transform this problem into a shortest path problem, every position in the text is a node in the graph, and every node $$i$$ is connected with nodes $$i+1,i+2,\ldots, n$$. The cost of going from node $$i$$ to node j < i[/latex] is $|C(T_{i,j})|$. It is easy to see that the best partition obtains a total size equal to the minimum path from $1$ to $n$. Here I include a figure of the graph (using IPE :-)). [caption id="attachment_49" align="aligncenter" width="449" caption="Example shortest path for text partitioning"][/caption] The main problem is that just building this graph takes [latex]O(n^3) time. Assume that $$C$$ takes linear time to compress a sequence, then building the graph takes:

$$\sum_{i=1}^{n-1} \sum_{j=i+1}^n j-i =\sum_{i=1}^{n-1}\sum_{j=1}^{n-i} j = \sum_{i=1}^{n-1}\sum_{j=1}^{n-1} j$$

$$= \sum_{i=1}^{n-1} O(n^2) = O(n^3)$$

In the paper they present an algorithm to approximate the problem, they require $$O(n\log_{\epsilon+1}n)$$ time and achieve an $$(1+\epsilon)-$$approximation. The main idea behind the approach is to approximate the graph in such a way that, by storing less edges, they can still approximate the cost of the minimum path. They show how to run the algorithm without building the approximated version of the graph, to keep the space consumption low. They also show how to estimate the size of the compression for 0-order and k-order compressors during the process.

Posted in Research | 1 Comment

## Drawings for Latex

Maybe many of you know many programs for drawing in Linux that allow you to embed the image in a $$\LaTeX$$ document and at the same time use the math typesettings of $$\LaTeX$$. I used to draw in Dia (dia-gnome in Ubuntu), then I devolved to xfig. Ok, lets stop here, I say devolved because I like better dia, not because I want to start a religious fight :-).

A couple of months ago Alejandro showed me a program called IPE (also in Ubuntu repositories). This program is amazing, it is easy to use (and that coming from a really bad user) and it supports $$\LaTeX$$ in a very friendly way, just type latex into the text fields and then with Ctrl+l you make the transition between the code and the actual drawing, you can see the way the picture looks right away and what you see is exactly what you get.

The only problem so far, I still don’t know how to change a text after typing it, but there is a solution for this (besides the brute-force approach of making the whole image from scratch). When you save the image as .ipe file, the file contains an XML that represents the image and has the source of each text, you can modify it there and then load the image into IPE again.

I’m editing the post to add the link to IPE, as Victor suggested:

http://tclab.kaist.ac.kr/ipe/

Posted in Random | 2 Comments

## Cooler address and Latex support

I just moved the blog with it’s old content to continue here, I got a much cooler url! Anyway, the main reason for moving the blog wasn’t the url, it was the $$LaTeX$$ support :-), now I can type geekier things like $$\sum_{i=0}^n 2^i = 2^{n+1}-1$$. And what has this to do with Categories Math and Computer Science? I just used them so they have at least one post :p.

Posted in Personal | 1 Comment

## Wireless Card Linux/Windows

I moved to my new place last month and the second thing I got was the Internet connection (first was the bed). My new apartment is small but has a *really* bad distribution of all the plugs in the walls. Since I didn’t have any good way to connect with a wire my computer to the router, I bought a wireless card. The card is a TRENDnet TEW-421PC/423PI, which is really cheap and it’s supposed to work fine. My apartment is so small that it shouldn’t give any problems, yet it did. The connection did drop randomly from time to time, as always, the drops did occur on the worst moment possible. It did happen in Linux (Ubuntu 9.04 32 bits) and Windows XP, but my notebook was working fine (so I assumed it was the card and not the router). When I was about to take the card back to the place I bought it, I found a page that explained that you could update your driver for the card using windows update. That did work perfectly, and then the problem for Linux was remaining. What I did was to install ndiswrapper with the drivers that came with the card and then copy the file rtl8185.sys from the folder system32 (in Windows) to /etc/ndiswrapper/net8185/. Now my Linux works with the new driver too, until now everything seems to work fine.

## And I’m back…

Yes, after almost one year I’m starting to write again, or maybe starting to write, didn’t write much last time :p. A big difference now is that I’m writing in English. Why? Well, it has a very simple explanation. I’m living in Canada now and most of the people I’ve met last time don’t speak Spanish. The second reason, and the most important one, most people I know will understand the posts in English, maybe using Google translator, but if they want to read it they will. The same doesn’t happen that much in the other direction, so here I go. By the way, this helps me to improve or practice mi writing =).

A couple of weeks ago I discovered Google Reader. By this time I think I’m almost selling my soul to Google: Gmail, Google Calendar, this blog, and now, Google Reader. Still it’s a really great application, it saves me from checking lots of pages that I visit almost every week and sometimes don’t find anything new or too many new things that I don’t know where to start. Now I just visit Google Reader every day and that’s it.

Ok, what’s this post about? I’m not writing the post to promote Google Reader itself, but to list the subscriptions I find more interesting :-). Here’s the list: