Siko1056@GSoC 2013 for GNU/Octave: 2013

Sunday, September 15, 2013

That's all folks?

Definitely not! Today is the GSoC "pencils down" day and I want to summarize the state of the project. You can get everything so far from here.

Goals for the Midterm 2013-07-29 (Monday) :

~~Integration of ITSOL completed (iluk.cc, ilut.cc,~~ ilutp.cc, ~~iluc.cc), callable from Octave~~
~~Wrapper file for MATLAB compatibility (ilu.m)~~
~~Small test cases for incomplete LU-factorization, showing their correctness~~
~~Documentation of incomplete LU-factorization types~~
~~Comprehensive test cases for all types of incomplete LU-factorization types (also proofing compatibility to MATLAB)~~
~~(optional) Complex version for the incomplete-factorizations~~

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

~~Implementations for IC(0) and ICT completed~~
~~Wrapper file for MATLAB compatibility (ichol.m) finished~~
~~Small test cases for all incomplete-factorizations~~
~~Documentation of incomplete LU-factorization types~~
~~Comprehensive test cases for all types of incomplete-factorization types (also proofing compatibility to MATLAB)~~
~~(optional) Complex version for the incomplete-factorizations~~

What there is left to do?

ILUTP must get to work
Make ITSOL a dependency for Octave (there are still some flaws in the official distribution, that make it difficult to use)
IC routines (ichol0jp and icholt) still suffer from Fortran indexing and missing complex versions
Implement the Modified version for all preconditioners.
Everything has to find it's way to the official Octave development repository.

Monday, September 9, 2013

One week left

The final week is coming closer. Here is the current state of my project:

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

Implementations for ~~IC(0) and~~ ICT completed
~~Wrapper file for MATLAB compatibility (ichol.m) finished~~
Small test cases for all incomplete-factorizations (ICT missing)
Documentation of incomplete LU-factorization types (ICT missing)
Comprehensive test cases for all types of incomplete-factorization types (also proofing compatibility to MATLAB) (ICT missing)
(optional) Complex version for the incomplete-factorizations (ICT missing)

This week ICT and the remaining documentation and test cases will be on my schedule. Last week I implemented a Modified Incomplete Factorization (MIC) option for IC(0), which unfortunately is not 100% compatible to MATLAB. As far as I understood the MIC (see Saad (chapter 10.3.5) for example) you add after a completed step the dropped elements (here marked red) to the main diagonal. MATLAB seems to handle this differently. If anyone knows more, let me know.

A = [ 0.37, -0.05, -0.05, -0.07;

-0.05, 0.116, 0.0, -0.05;

-0.05, 0.0, 0.116, -0.05;

-0.07, -0.05, -0.05, 0.202];

Normal Cholesky Factorization

L1 = chol (A, 'lower');

full(L1)

ans =

0.608276253029822 0 0 0

-0.082199493652679 0.330519656364403 0 0

-0.082199493652679 -0.020442828820163 0.329886850288205 0

-0.115079291113750 -0.179896893617439 -0.191390050272693 0.346068942669187

Incomplete Cholesky Factorization no MIC

opts.type = 'nofill';

opts.michol = 'off';

L2 = ichol (sparse (A), opts);

full(L2)

ans =

0.608276253029822 0 0 0

-0.082199493652679 0.330519656364403 0 0

-0.082199493652679 0 0.330519656364403 0

-0.115079291113750 -0.179896893617439 -0.179896893617439 0.352180311900522

Incomplete Cholesky Factorization with MIC

opts.michol = 'on';

L3 = ichol (sparse (A), opts);

full(L3)

ans =

0.608276253029822 0 0 0

-0.082199493652679 0.320135106613577 0 0

-0.082199493652679 0 0.320135106613577 0

-0.115079291113750 -0.185732393077497 -0.185732393077497 0.346068942669187

The way the dropped element is computed, no problem.

drop = (A(3,2) - L2(2,1) * L2(3,1)) / L2 (2,2)

drop =

-0.020442828820163

Expected compensated diagonal value

D_2_2 = L2(2,2) + drop % * x

D_2_2 =

0.310076827544240

Some factor, the results differ with each other

x = full(L3 (2,2) - L2(2,2)) / drop

x =

0.507980076641026

But the main diagonal D_2_2 must be computed first. The off diagonal can be computed with it.

L_4_2 = (A(4,2) - L3(2,1) * L3(4,1)) / L3 (2,2)

L_4_2 =

-0.185732393077497

The complex input handling works compatible to MATALB. An example:

A = [ 0.37, -0.05, -0.05, -0.07;

-0.05, 0.116, 0.0, -0.05 + 0.05i;

-0.05, 0.0, 0.116, -0.05;

-0.07, -0.05 - 0.05i, -0.05, 0.202];

opts.type = 'nofill';

opts.michol = 'off';

L = ichol (sparse (A), opts);

norm (A - L * ctranspose (L), 'fro') / norm (A, 'fro')

ans = 0.0195288512358234

Sunday, September 1, 2013

Lost in translation with nice results

Last week I spend reimplementing the IC(0) Fortran routines by Mark T. Jones and Paul E. Plassmann (see my last blog post) in C++. This effort was done, because

Octave is naively C++. Thus it is possible to adapt the code to more sophisticated data types (especially I think in complex sparse data types) so it will be able to handle more general input.
I am much more familiar with C++. This makes it easier for me to extend the code with features like the Modified Incomplete Cholesky Factorization (MIC) option and diagonal shifting.

The result is accessible from my repository:

ichol0.cc - The standard IC(0) routine
ichol0jp.cc - IC(0) with Jones and Plassmann strategy (preserve number of non-zeros per row/column, but take the largest ones)

For testing the implementation, I took the two examples from the last post and added one example test case from http://www.mathworks.com/help/matlab/ref/ichol.html. The m-files are available from here.

Octaves output

Test Problem 1

err_ichol0 = 0.0197360236512918

err_ichol0jp = 0.0197548896386716

Test Problem 2

err_ichol0 = 0.0840790133633081

err_ichol0jp = 0.0846790338305638

Test Problem 3

err_ichol0 = 0.0915989875674983

err_ichol0jp = 0.0926062848306512

MATLAB (R2012b) output

Test Problem 1

err_ichol = 0.019736023651292

Test Problem 2

err_ichol = 0.084079013363308

Test Problem 3

err_ichol = 0.091598987567498

Interesting was to observe, that the results of the standard IC(0) match very well with those of MATLAB. The Jones and Plassmann strategy does not differ that much from the standard IC(0), but according to their paper, it should speed up the preconditioned solution. So it will be offered as an additional options.type='jp' for example.

Monday, August 26, 2013

A Fortran story with Algorithm 740

Last week I was looking for an good existing implementation of the Incomplete Cholesky Factorization and finally decided for Algorithm 740 by Mark T. Jones and Paul E. Plassmann. Next to this code, I had a deeper look at many promising libraries, but found nothing I could work upon. Algorithm 740 is a Fortran implementation of the no-fill Incomplete Cholesky Factorization [IC(0) in short] for column wise stored matrices. What do you want more?

To see if this code works, I started writing an interface for Octave and tested it using two small toy problems from Quarteroni, Saleri, Gervasio: Scientific Computing with MATLAB and Octave: Third Edition, Springer 2010.

% Problem 5.1 (Hydraulic network)

A =[-0.37 0.05 0.05 0.07;
    0.05 -0.116 0 0.05;
    0.05 0 -0.116 0.05;
    0.07 0.05 0.05 -0.202];
A = -A;

% Test for positive definite
for i = 1:length (A)
    assert (det (A (1:i,1:i)) > 0)
endfor

% Prepare input for solver
D = diag (A);
A_off = sparse (tril (A - diag (D)));

[L_off, D] = JPICC_wrapper (A_off, D);
L = L_off + diag (D);

% Compute error
R = A - L * L'
norm (R,'fro')

>> norm = 0.0095646

% Problem 5.4 (Capillary networks)

A = [-1/4 1/10 1/10 0 0 0 0 0 0 0 0 0 0 0 0;
     0 -1/2 0 1/5 1/5 0 0 0 0 0 0 0 0 0 0;
     0 0 -1/2 0 0 1/5 1/5 0 0 0 0 0 0 0 0;
     0 0 0 -1 0 0 0 0.4 0.4 0 0 0 0 0 0;
     0 0 0 0 -1 0 0 0 0 0.4 0.4 0 0 0 0;
     0 0 0 0 0 -1 0 0 0 0 0 0.4 0.4 0 0;
     0 0 0 0 0 0 -1 0 0 0 0 0 0 0.4 0.4;
     0 0 0 0 0 0 0 -2 0 0 0 0 0 0 0;
     0 0 0 0 0 0 0 0 -2 0 0 0 0 0 0;
     0 0 0 0 0 0 0 0 0 -2 0 0 0 0 0;
     0 0 0 0 0 0 0 0 0 0 -2 0 0 0 0;
     0 0 0 0 0 0 0 0 0 0 0 -2 0 0 0;
     0 0 0 0 0 0 0 0 0 0 0 0 -2 0 0;
     0 0 0 0 0 0 0 0 0 0 0 0 0 -2 0;
     0 0 0 0 0 0 0 0 0 0 0 0 0 0 -2];
A = -(A + triu (A, 1)');

% Test for positive definite
for i = 1:length (A)
    assert(det(A(1:i,1:i)) > 0)
endfor

% Prepare input for solver
D = diag (A);
A_off = sparse (tril (A - diag(D)));

[L_off, D] = JPICC_wrapper (A_off, D);
L = L_off + diag(D);

% Compute error
R = A - L * L'
norm(R,'fro')

>> norm = 0.53208

The lack of a well-tested library like ITSOL makes this part of the GSoC more difficult. But MATLAB implements "only" IC(0), ICT and the modified IC option. So it will be much work left to do, but it should remain possible within the remaining weeks.

Tuesday, August 20, 2013

Little update

The repository mess

I recreated the structure of my public repository, because of non-conforming Mercurial commit messages. If you already downloaded changesets from my old repository, I recommend to strip them:

hg strip -r MY_OLD_CHANGESET

And to get the newer ones as described here.

On my way to IC(0) and ICT

To get these two implementations I had a look at several libraries again (many of them suggested by Nir):

~~SparseLib++~~ (http://math.nist.gov/sparselib++/):

Contains a very simple implementation of IC(0) in SparseLib++/1.7/src/icpre.cc
Implementation just consists of coping the lower triangular part, this can be done smarter according to the publications I read so far.

SparseLab (http://www2.cs.cas.cz/~tuma/sparslab.html)

Also looks promising. But didn't got it to work so far with the MEX interface.

~~CHOLINCSP~~ (http://theoval.cmp.uea.ac.uk/matlab/default.html#cholinc):

~~Good start for implementing an own version, Mex-Interface with C-Code~~
A deeper look at the underlying paper (http://machinelearning.wustl.edu/mlpapers/paper_files/FineS01.pdf) revealed, that their understanding of an "incomplete Cholesky Factorization" is: "If $Q$ is positive semideﬁnite and singular, then it is still possible to compute an “incomplete” Cholesky factorization $GG^T$, where some columns of $G$ are zero." That is definitely not what MATLABs ichol does.

740 (http://dl.acm.org/citation.cfm?doid=200979.200986)

Not free, but authors are requested. I started to port this function to C++ in order to test it, because it looked as one of the most promising ones.

hsl_mi28-1.1.0 (http://www.hsl.rl.ac.uk/catalogue/hsl_mi28.xml)

Not free, but maybe from their publications one can get useful ideas. I also got an academic version of this function to test.
http://epubs.cclrc.ac.uk/bitstream/9133/RAL-P-2013-004.pdf
http://epubs.cclrc.ac.uk/bitstream/9132/RAL-P-2013-005.pdf

New insights about Scilab and ILU

During the research about Incompete Cholesky factorizations, I found an interesting toolbox Spilu at Scilab. The developers of Scilab took Saads SpimplementedarseKit as basis of their seemingly well tested Fortran functions for nearly all ILUs I interfaced from ITSOL. It is definitly worth to have a look at this library, but I don't know if it was reasonable to change ITSOL to Scilabs implementation. Unfortunately Scilab doesn't seem to have any IC factorization...

News from Yousef Saad

I created another public repository with the changes Marco Atzeri (http://matzeri.altervista.org/works/itsol/) and I made to ITSOL https://code.google.com/p/itsol-clone-siko1056/. The purpose of the repository is not to maintain an own version, it is to facilitate the decision process for upstream changes by Yousef Saad. He wants to consider my changes, when Ruipeng Li is soon available again.

Sunday, August 4, 2013

Mercurial commit message cleaning

Now the midterm is over, one can already think about to make my code ready to be pushed to the official Savannah repository of GNU/Octave. So on the maintainers-mailing list there was a discussion about how to make my changesets fit for the official repository.

Again thanks to Jordi I really learned a lot, how to "manipulate" the Mercurial history. As a first step one should get Mercurial 2.7, because older versions might not support all features used in this post. The current version from the Debian repository does not suffice.

A good point to start is getting a list of all my changesets:

hg log --user k.ohlhus

Now my first change in current development is, that I no longer want to merge the Savannah repository into mine. Instead I got to know about the rebase command from Mercurial, which sets all your nonofficial changesets on top of the history, that makes it easier to track what has changed in my repository.

The old state of my repository is very good explained in Scenario B of the rebase manual page. To fix that I had to run a mix of two commands for all my merges, e.g.:

hg phase --draft MERGE_CHANGESET --force
hg rebase --source MERGE_CHANGESET --dest LAST_CHANGESET_FROM_SAVANNAH

From now all my changes are on top. The second task of cleaning is to tidy up my commit messages, as I did not stick to the commit message guidelines of Octave. By typing

hg histedit -r LAST_BAD_CHANGESET_OF_MINE

I can create an edit plan for my changeset commit messages and carry this plan out.

edit plan example:

mess a623d25a6b42 17121 Startet fixing the transposition problem with all ILU. S

pick 466fb4342423 17122 Interface change in itsol_util.h and further addition of

pick 5afd5d5f42d0 17123 Added MATLAB compatibility wrapper for ILU-factorization

pick 0dbc3caefb6e 17124 Integrating ZITSOL-library to Octave build system.

pick e0b86073ce45 17125 Extended ILU-Preconditioner for complex input using ZITS

pick 9436002c0a0b 17126 Updated test cases. Fixing ILUC output.

pick 9d4e24a01f42 17127 Test cases for complex case increased.

pick 1b6ac1f0991b 17128 Test cases for complex case in ilut.

pick 286aa419eb58 17129 Redirected output of ITSOL and ZITSOL from stderr to tem

pick 975f30ec1c1b 17130 Make most recent development on ILUTP public. Still not

pick 65c0db01a558 17131 Final tidy up on Matlab wrapper with tests.

pick 4244faff69d6 17132 Moved ilu.m from scripts/linear-algebra to scripts/spars

pick 31a15f423dd8 17133 removed and new build entries.

# Edit history between a623d25a6b42 and 31a15f423dd8

# Commands:

# p, pick = use commit

# e, edit = use commit, but stop for amending

# f, fold = use commit, but fold into previous commit (combines N and N-1)

# d, drop = remove commit from history

# m, mess = edit message without changing commit content

This tedious, but necessary work has to find its way to my development repository after my journey (13th of August). Then I'll provide instructions how to get the new changesets and how to remove the older ones. In any case I'm sorry for the inconvenience.

Tuesday, July 30, 2013

GSoC Part 2

Now follows the second part of the Google Summer of Code. Here is what I planned to do in that time.

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

Implementations for IC(0) and ICT completed
Wrapper file for MATLAB compatibility (ichol.m) finished
Small test cases for all incomplete-factorizations
Documentation of incomplete LU-factorization types
Comprehensive test cases for all types of incomplete-factorization types (also proofing compatibility to MATLAB)
(optional) Complex version for the incomplete-factorizations

TODOs:

~~Clean the development repository.~~

~~Upload the changes~~

Discuss and decide for implementation of IC(0) and ICT

Paper [Jones_Plassmann_1991] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.5862 and a Fortran implementation from the ACM [Jones_Plassmann_1995] http://doi.acm.org/10.1145/200979.200986

Already requested the authors for free usage

Via an Paper [Munksgaard_1980] https://dl.acm.org/citation.cfm?doid=355887.355893 I got attention to the HSL (formerly the Harwell Subroutine Library) http://www.hsl.rl.ac.uk/ which also has a Fortran implementation of ICs. Problem the license again.

Already requested the authors for free usage

Paper [Benzi_2002] http://www.mathcs.emory.edu/~benzi/Web_papers/survey.pdf describes really all-encompassing preconditioning. Michele Benzi and Miroslav Tůma still have the sources for their survey including some MATLAB interface. http://www2.cs.cas.cz/~tuma/sparslab.html

Free Software

Paper [Lin_MORÉ_1999] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.1636 present a MATLAB pseudocode based upon [Jones_Plassmann_1995]. No straight forward implementation possible.
Saad gives small pseudocodes in a paper [Saad_1992] http://www-users.cs.umn.edu/~saad/PDF/RIACS-ILQ-TR.pdf and in his book [Saad_2003] http://www-users.cs.umn.edu/~saad/IterMethBook_2ndEd.pdf about the IC(0). The Mathworks cites Saad's book and a paper [Manteuffel_1980] http://www.ams.org/journals/mcom/1980-34-150/S0025-5718-1980-0559197-0/S0025-5718-1980-0559197-0.pdf which describes the shifting algorithm mathematically.

Monday, July 29, 2013

Midterm summary

Today it is time for the midterm and to summarize what has already been done. Here is again my favorite picture to show my progress and afterwards some text.

Goals for the Midterm:

Integration of ITSOL completed (iluk.cc, ilut.cc, ilutp.cc, iluc.cc), callable from Octave
Wrapper file for MATLAB compatibility (ilu.m)
Small test cases for incomplete LU-factorization, showing their correctness
Documentation of incomplete LU-factorization types
Comprehensive test cases for all types of incomplete LU-factorization types (also proofing compatibility to MATLAB)
(optional) Complex version for the incomplete-factorizations

All in all I'm satisfied with the first part of this GSoC. I wrote about 2374 SLOC (rough wc -l estimate), patched two libraries and modified some config files. The major part of my work except the integration into the Octave build system can be downloaded from here to try it out without needing a development version.

Going from the bottom up in the picture, I'm not an expert, but very familiar with ITSOL and ZITSOL now. I created patches for both libraries (also included into my archive above) that one can use them on a unix-like system, but so far they are only tested on my *buntu systems. A special part of this work is, that I patched a whole ZILUC function into ZITSOL, which should be used with care. All other complex version are mainly implemented by switching the data type. For small examples this approach works at ILUC as well, but some differences where observable.

With patched libraries one can go an level up, where I created interfaces to use the most relevant function from the preconditioner libraries. For mastering this task I needed to get to know the different matrix storage formats of Octave and (Z)ITSOL and had some trouble with the matrix conversion. Building upon Wei Jins initial work I successfully interfaced ILUK, ILUT and ILUC + their complex versions. They all deliver comparable results to Matlab. Sadly the authors of (Z)ITSOL seem to be busy at this moment, so I still don't have any clue how to get ILUTP to run. This will tried to be solved in the second part of GSoC, too.

When you have an interface, it still needs to be integrated in the big Octave code base. Therefore I got familiar with the Octave build system and made all the sufficient changes, including checking for the presence of ITSOL and ZITSOL and making sure that all necessary files are build by default.

On the last level there is "only" one m-file ensuring the compatibility to MATLABs ilu-function. As one can see from the picture, I couldn't make everything green because of two reasons. First ILUTP is missing and second I cannot include an efficient MILU (modified ILU, see http://www.mathworks.de/de/help/matlab/ref/ilu.html) without changing the code of ITSOL entirely. The following two examples work in Octave and MATLAB the same, especially the last example reveals the benefit of preconditioning.

Example 1

A = gallery('neumann', 1600) + speye(1600);
setup.type = 'nofill';
nnz(A)
ans = 7840

nnz(lu(A))
ans = 126478

nnz(ilu(A,setup))
ans = 7840

Example 2

A = gallery ('wathen', 10, 10);
b = sum (A,2);
tol = 1e-8;
maxit = 50;
opts.type = 'crout';
opts.droptol = 1e-4;
[L, U] = ilu (A, opts);
x = bicg (A, b, tol, maxit, L, U);
norm(A * x - b, inf)

All written functions contain much documentation, callable with doc ilu* and comprehensive test cases, which you can try out from the archive with
test ilu
test src/iluc
test src/iluk
test src/ilut
because *.oct file and source file are in separate directories.

Besides coding, I additionally had the pleasure to join this years OctConf in Milan. I was given the opportunity to know better the community and held a short presentation about my project.

Finally I want to thank the Octave community for supporting me a lot during the initial part of my project work and I really look forward for the second part and the time after this!

Sunday, July 21, 2013

A little spring-cleaning

In order not to introduce too many new files for the ILU preconditioners in libinterp/dldfcn I made heavy use of C++ templates and merged real and complex versions into one file. This decision has been supported by the fact, that nearly all steps for real and complex input are the same, just function names and data types have to be changed accordingly. I hope not to be the only one agreeing on this trade-off between code duplication and clarity :-/

The integration of ITSOL and ZITSOL into GNU/Octave so far can be summarized as follows:

	ITSOL	ZITSOL
ILUK*	yes	yes
ILUT	yes	yes
ILUC*	yes	experimental**
ILUTP*	no	no

* Function is important for the MATLAB compatibility wrapper ilu.m
** Own complex implementation analogous to real version

The current interface can be obtained ~~from this archive (does not require a development version of GNU/Octave) or~~ from my repository.

Friday, July 19, 2013

Becoming complex with ZTISOL

Two open tasks are remaining for me until the midterm next week. I want to write more comprehensive tests and I have to get ILUTP to work. For the latter I don't think to succeed without further support by the authors of ITSOL, which I already contacted.

While waiting for a response, Nir and I decided to focus on my optional goal of complex ILU versions which are realized in the ZITSOL library. It is straight forward to interface them analogous to the real versions. After tiding up my code, I will submit the changes to my development repository this weekend. As a first step I spend some time getting ZITSOL in a appropriate library format analog to the ITSOL library.

Build the ZITSOL-library from source

Get the source tar-archive
Extract the archive: tar -xvf ZITSOL_1.tar.gz
Get my patch
Apply the patch to the extracted archive:

cd ZITSOL_1
patch -p1 < itsol.patch

Build the library using: make

Now the shared and static library are created in ZITSOL_1

Copy the four header files from ZITSOL_1 and ZITSOL_1/LIB to /usr/include/zitsol (root rights may be required)
Copy the two libraries from ZITSOL_1 to /usr/lib
Make symbolic links

ln -s libzitsol.so.1.0.0 libzitsol.so.1
ln -s libzitsol.so.1 libzitsol.so

Sunday, July 14, 2013

A MATLAB wrapper

Now there exists a wrapper ilu.m which should guarantee compatibility to calls of the ilu function from MATLAB. After an input checking the types are assigned as follows:

'nofill' --> iluk()
'crout' --> iluc()
'ilutp' --> ilut() // the real ilutp still doesn't work for me...

ilu.m still needs lots of test cases. To create them, I want to compare MATLABs behavior, which I have access to in my university. A little weird is, that the second test case taken from the MATLAB ilu description, which I posted earlier about, doesn't seem to work any longer. Maybe that has something to do with the conversion changes? I'll check this. Additionally the crout version example does not produce the same output, as MATLAB does. This might have the reason, that I cannot offer the modified ILU (MILU) option presented there.

To get an overview of my work, see the updated schedule.

Monday, July 8, 2013

The transposition "problem"

A little late, but here is an excerpt of what I did last week. During the interface development I was confronted with the problem of transposing the input and output matrices, as described in my previous blog entry.

With some help of my Master Thesis mentor we finally figured out how to use ITSOL without transpositions.

1. What does ITSOL internally?

ITSOL doesn't perform a "classical" LU-decomposition

$ \mathbf{A} = \mathbf{L}\mathbf{U} $,

it performs a memory efficient decomposition of the form:

$ \mathbf{L'} = \begin{pmatrix}
0 & & & \\
& 0 & & \\
& x & \dots & \\
x & x & & 0 \\
\end{pmatrix} $

$ \mathbf{U'} = \begin{pmatrix}
0 & & & x \\
& 0 & & x \\
& & \dots & \\
& & & 0 \\
\end{pmatrix} $

$ \mathbf{D'} = \begin{pmatrix}
d_1 \\
d_2 \\
\dots \\
d_n \\
\end{pmatrix} $

Keep in mind, that only the non-zero elements $ \mathbf{x} \in \mathbb{R} \neq 0 $ are explicitly stored in the ITSOL compressed row storage format. To obtain the "classical" LU-decomposition, one has to add a diagonal of ones to L' and the diagonal D' to U'.

$ A = \underbrace{\begin{pmatrix}
1 & & & \\
& 1 & & \\
& x & \dots & \\
x & x & & 1 \\
\end{pmatrix}}_{\mathbf{L}}\cdot \underbrace{\begin{pmatrix}
d_1 & & & x \\
& d_2 & & x \\
& & \dots & \\
& & & d_n \\
\end{pmatrix}}_{\mathbf{U}} $

That is what Wei Jin has already done in his first version of the ITSOL-interface. The drawback of his approach is, that one has to convert the input Matrix A first into the coordinate format (COO) and afterwards into the ITSOL compressed row storage format.

2. How to avoid format conversions and transpositions?

Now comes the tricky part. We found out, that the values of L' and U' are scaled with D' during ITSOLs factorization process. This is important to know, when we do an implicit transposition while converting from Octaves column storage format to ITSOLs row storage format and vice versa.

Using ITSOL standalone we would get a decomposition like

$ \left[\mathbf{L'}, \mathbf{D'},\mathbf{U'}\right] = iluk(\mathbf{A}) $

and I just described how to get from there to the "classical" LU-decomposition. With implicitly transposition of A by converting the input from Octaves format, we get

$ \left[\mathbf{L''}, \mathbf{D'},\mathbf{U''}\right] = iluk(\mathbf{A^{T}}) $.

Note that the diagonal D' is the same in both cases and L'' and U'' are scaled. While converting L'', D' and U'' back to Octaves format, we perform another implicit transposition, thus one will receive $ \mathbf{L''^{T}} $ and $ \mathbf{U''^{T}} $ back in the Octave workspace.

Now the final recipe:

Ignore implicit transposition of input Matrix A (second decomposition case).
Scale $\quad \mathbf{L'} = diag(\mathbf{D'}) \cdot \mathbf{L''} \quad$ and $\quad \mathbf{U'} = \mathbf{U''} \cdot diag(\mathbf{1/D'}) \quad$
Swap $\quad \mathbf{L'} = \mathbf{U'} \quad$ and $\quad \mathbf{U'} = \mathbf{L'} \quad$. This has the effect, that with the implicit transposition back to Octave the triangular shape of both matrices is correct.
Add the diagonal of ones and the diagonal D' like described above.

3. Implementation details

Now some final words where to find this in my code. In the file itsol_util.h one can find the methods for fast conversion from the Octave to ITSOL format itsol_convert_to_cs() itsol_util.h:71 that does nothing more than copying the values to the ITSOL data structure.

The conversion back to Octave resulted in the functions itsol_convert_upper_cs_to_octave() itsol_util.h:135 and itsol_convert_lower_cs_to_octave() itsol_util.h:205. The scaling is not performed as a full matrix multiplication. It is quickly done while converting to the Octave format in itsol_util.h:230 for example. So the overhead is minimal compared to a matrix transposition. The swapping is realized by just changing the names. For example iluk.cc:109:

SparseMatrix L = itsol_convert_lower_cs_to_octave (lu->U, scale, add_diag);

4. The results

more off
clc

A = sprand(4000,4000,1e-3) + speye(4000);
iter = 1000;
% For MATLAB only
%opt.type = 'nofill';
%opt.milu = 'off';

% Time for a transposition
time = cputime();
for i = 1:iter
B=A';
end
time = (cputime() - time) / iter;
fprintf(1, 'Time for transposition:\t%e\n', time);

% Time for the factorization
time = cputime();
for i = 1:iter
% For Octave only
[L, U] = iluk(A);
% For MATLAB only
%[L, U] = ilu(A, opt);
end
time = (cputime() - time) / iter;
fprintf(1, 'Time for iluk:\t\t%e\n', time);

GNU/Octave (3.7.5):

Time for transposition:

3.000190e-04 s

Time for iluk (with transposition):

6.028377e-03 s

Time for iluk (without transposition):

4.480280e-03 s

MATLAB R2012b:

Time for transposition:

3.200000e-04 s

Time for iluk (without transposition):

6.200000e-04 s

Tuesday, June 25, 2013

My little talk on the OctConf 2013

The material of my little talk is on the OctConf 2013 available from here:

https://drive.google.com/folderview?id=0B4V9W1l15-xMOU5fTHJ6ZzlEVXc&usp=sharing

Saturday, June 22, 2013

Getting ITSOL to work

At the moment I'm writing an interface for GNU/Octave, to make use of the ITSOL library. This post should be a collection of information how to get this library on your machine.
As far as I figured out, there are is only an outdated package maintained in the Debian repository for ITSOL, so one has to build the library and copy the necessary files to the file system.

Build the ITSOL-library from source

Get the source tar-archive
Extract the archive: tar -xvf ITSOL_2.tar.gz
Get my patch
Apply the patch to the extracted archive:

cd ITSOL_2
patch -p1 < itsol.patch

Build the library using: make

Now the shared and static library are created in ITSOL_2/LIB

Copy the header files from ITSOL_2/INC to /usr/include/itsol (root rights may be required)
Copy the two libraries from ITSOL_2/LIB to /usr/lib
Make symbolic links

ln -s libitsol.so.1.0.0 libitsol.so.1
ln -s libitsol.so.1 libitsol.so

Hint for steps 6+7: If you choose a custom location, you can build Octave later using the configure options:

--with-itsol-includedir=DIR

--with-itsol-libdir=DIR

See configure --help for details

Linux

Debian/Ubuntu: sudo apt-get install libitsol1 libitsol-dev (outdated)
openSuse: ??
fedora: ??
... : ??

Windows

MinGW: ??
Cygwin: ??

Mac

Friday, June 14, 2013

And then there was iluk()

Little good news for the weekend: my first interface to ITSOLs ILUK is working as desired:

>> A_sparse = spconvert([...
     1 4 2 3 3 4 2 5; % row indices
     1 1 2 3 4 4 5 5; % column indices
     1 2 3 4 5 6 7 8  % non-zero values
     ]');

>> [L,U] = iluk(A_sparse)
L =

Compressed Column Sparse (rows = 5, cols = 5, nnz = 6 [24%])

  (1, 1) ->  1
  (4, 1) ->  2
  (2, 2) ->  1
  (3, 3) ->  1
  (4, 4) ->  1
  (5, 5) ->  1

U =

Compressed Column Sparse (rows = 5, cols = 5, nnz = 7 [28%])

  (1, 1) ->  1
  (2, 2) ->  3
  (3, 3) ->  4
  (3, 4) ->  5
  (4, 4) ->  6
  (2, 5) ->  7
  (5, 5) ->  8

>> L * U - A_sparse % Should be a zero matrix
ans =

Compressed Column Sparse (rows = 5, cols = 5, nnz = 0 [0%])

Another example from MATLAB http://www.mathworks.com/help/matlab/ref/ilu.html

>> A = gallery('neumann', 1600) + speye(1600);
>> nnz(A)
ans =  7840
>> nnz(lu(A))
ans =  126402
>> [L,U] = iluk(A);
>> nnz(L+U-speye(size(A))) % This behaviour will be added to ilu.m later
ans =  7840

Now I'm very optimistic, that the ILUT and ILUC will be interfaced till the end of the next week.

Monday, June 10, 2013

Yet another repository

Thanks to Jordi, hereby I announce a new Mercurial-repository for the development of my GSoC project. It has a public web interface:

http://inversethought.com/hg/octave-kai/

~~Clone my whole repository:~~

    hg clone http://inversethought.com/hg/octave-kai/

~~If you have already an Octave repository and want to give my code a try. First check impact, then pull:~~

    hg incoming http://inversethought.com/hg/octave-kai/ -r kais-work

    hg pull http://inversethought.com/hg/octave-kai/

~~There follow some notes mainly for myself, how one can work with that Mercurial-repository.~~

See new page: http://siko1056-gsoc.blogspot.de/p/getting-my-work.html

Working with bookmarks

hg bookmark kais-work # Create bookmark

hg update -r kais-work # Switch to my bookmark

hg push -B kais-work # Make it public available

Stay up to date with recent development changes:

~~hg pull http://www.octave.org/hg/octave~~
~~hg merge default~~
~~hg commit -m "maint: periodic merge with Octave repository"~~
~~hg update -C tip~~
~~hg bookmark -f kais-work~~
~~hg push ssh://inverse@inversethought.com/hg/repos/octave-kai~~

See new page: http://siko1056-gsoc.blogspot.de/p/getting-my-work.html

A more detailed plan of action

In this post I want to summarize and update my project state and progress. First an overview:

Project timeline:

https://www.google.com/calendar/embed?src=fe9sdg54bacec0epcppj4d0i1g%40group.calendar.google.com&ctz=Europe/Berlin

Goals for the Midterm 2013-07-29 (Monday) :

Integration of ITSOL completed (iluk.cc, ilut.cc, ilutp.cc, iluc.cc), callable from Octave
Wrapper file for MATLAB compatibility (ilu.m)
Small test cases for incomplete LU-factorization, showing their correctness
Documentation of incomplete LU-factorization types
Comprehensive test cases for all types of incomplete LU-factorization types (also proofing compatibility to MATLAB)
(optional) Complex version for the incomplete-factorizations

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

Implementations for IC(0) and ICT completed
Wrapper file for MATLAB compatibility (ichol.m) finished
Small test cases for all incomplete-factorizations
Documentation of incomplete LU-factorization types
Comprehensive test cases for all types of incomplete-factorization types (also proofing compatibility to MATLAB)
(optional) Complex version for the incomplete-factorizations

Monday, June 3, 2013

Sparse exercise: a potential interface for ITSOL or CRS-based functions

The user wants to factorize this sparse matrix

$ A = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 7 \\ 0 & 0 & 4 & 5 & 0 \\ 2 & 0 & 0 & 6 & 0 \\ 0 & 0 & 0 & 0 & 8 \\ \end{pmatrix} $

Get the matrix into Octave

Just provide a Matrix with three columns:

>> A_sparse = spconvert([...
     1 4 2 3 3 4 2 5; % row indices
     1 1 2 3 4 4 5 5; % column indices
     1 2 3 4 5 6 7 8  % non-zero values
     ]')

A_sparse =

Compressed Column Sparse (rows = 5, cols = 5, nnz = 8 [32%])

  (1, 1) ->  1
  (4, 1) ->  2
  (2, 2) ->  3
  (3, 3) ->  4
  (3, 4) ->  5
  (4, 4) ->  6
  (2, 5) ->  7
  (5, 5) ->  8

What needs to be passed to ITSOL

The approach I would favor in order to interface the ITSOL library is to factorize the transposed matrix to avoid timing problems resulting from input and output format conversion.

$ \mathbf{A} = \mathbf{L}\mathbf{U} \Leftrightarrow \mathbf{A^{T}} = \mathbf{U^{T}}\mathbf{L^{T}} $

Conversion with implicit tranposition.

Information on the Octave sparse matrix format:

Friday, May 31, 2013

Octave-forge package "incomplete-factorization"

1. Approach "Interfacing ITSOL"

~~Currently I plan to do something like this and started to work on an cloned SVN repository from octave -forge:~~

2. Approach "Stand-alone implementations"

A few hours ago I got an offer by Yousef Saad and Ruipeng Li to obtain stand-alone implementations of several incomplete factorizations. https://mailman.cae.wisc.edu/pipermail/octave-maintainers/2013-May/033874.html

I would favor to refactor them to native Octave data types internally and make them directly available as *.oct files for usage. By interfacing the ITSOL library I think there will be a too big memory and time overhead due to data format conversion.

Wednesday, May 29, 2013

Introduction

My name is Kai Torben Ohlhus (aka Siko1056) and this blog is intended to talk about my work on GNU/Octave for this years Google Summer of Code.

Further information about me and my GSoC project you can find within the following links:

Thank you for reading.

Sunday, September 15, 2013

Goals for the Midterm 2013-07-29 (Monday) :

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

What there is left to do?

Monday, September 9, 2013

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

Sunday, September 1, 2013

Monday, August 26, 2013

Tuesday, August 20, 2013

The repository mess

On my way to IC(0) and ICT

New insights about Scilab and ILU

News from Yousef Saad

Sunday, August 4, 2013

Tuesday, July 30, 2013

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

TODOs:

Monday, July 29, 2013

Goals for the Midterm:

Sunday, July 21, 2013

Friday, July 19, 2013

Build the ZITSOL-library from source

Sunday, July 14, 2013

Monday, July 8, 2013

1. What does ITSOL internally?

2. How to avoid format conversions and transpositions?

3. Implementation details

4. The results

Tuesday, June 25, 2013

Saturday, June 22, 2013

Build the ITSOL-library from source

Linux

Windows

Mac

Friday, June 14, 2013

Monday, June 10, 2013

Working with bookmarks

hg bookmark kais-work # Create bookmark hg update -r kais-work # Switch to my bookmark hg push -B kais-work # Make it public available

Stay up to date with recent development changes:

Project timeline:

Goals for the Midterm 2013-07-29 (Monday) :

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

Monday, June 3, 2013

The user wants to factorize this sparse matrix

Get the matrix into Octave

What needs to be passed to ITSOL

Friday, May 31, 2013

1. Approach "Interfacing ITSOL"

2. Approach "Stand-alone implementations"

Wednesday, May 29, 2013

hg bookmark kais-work # Create bookmark

hg update -r kais-work # Switch to my bookmark

hg push -B kais-work # Make it public available