[dune-pdelab] Parallel (block?) preconditioners

Wed Nov 2 14:56:51 CET 2011

Sorry, I just realised I made a mistake: with 8 unknowns the serial 
matrix should of course be 8x8 instead of 10x10.

Eike Mueller wrote:
> Hi Christian,
> 
> many thanks for your reply. I guess in the overlapping case ILU0 will 
> consider the couplings A_{ij} if i is local to the processor and j is 
> within the overlap region (in my case I use an overlap of 1).
> 
> So if I have, for example, a 1d problem with 8 unknowns represented by a 
> graph like this:
> 
> o--o--o--o--o--o--o--o, it will decompose this into
> 
> o--o--o--o--x (processor 1) and x--o--o--o--o (processor 2)
> 
> (o denotes unknowns, x denotes overlap elements and -- couplings)
> 
> The serial matrix will be 10x10, and will be ILU0 decomposed and 
> inverted (whenever the preconditioner is applied). In the parallel 
> case, each processor will store a 5x5 matrix, which will be ILU0 
> decomposed and inverted locally (whenever the preconditioner is 
> applied). The ISTL consistency model enforces that the vector the 
> preconditioner is applied to is in a unique representation before 
> application of the preconditioner and that the result is in a 
> consistent representation (as defined in P. Bastian, M. Blatt. On the 
> Generic Parallelisation of Iterative Solvers for the Finite Element 
> Method. Int. J. Computational Science and Engineering, 4(1):56-69, 2008).
> 
> I can imagine that inverting the two 5x5 ILU0 matrices and converting to 
> a consistent representation will give a different result than inverting 
> the 10x10 ILU0 matrix on one processor (ok, in the 1d case it might not 
> give a different result, but in the 2/3d equivalent it probably will), 
> although it is not completely obvious to me at the moment. I can see 
> what happens for 5 sweeps of SSOR in overlapping domains as above: if 
> the halos are not swapped between the sweeps, then the result will of 
> course differ from doing 5 sweeps on the entire domain.
> 
> Apologies for thinking aloud...
> 
> Eike
> 
> Christian Engwer wrote:
>> Hi Eike,
>>
>> in dune-istl the parallel run is usually different from the sequential
>> run, especially for ILU. ISTL doesn't use global preconditioners, but
>> uses a Schwarz-style parallelization.
>>
>> Christian
>>
>> On Wed, Nov 02, 2011 at 12:52:06PM +0000, Eike Mueller wrote:
>>> Dear dune-pdelab list,
>>>
>>> I have modified the overlapping ISTL backend for CG with an SSOR
>>> preconditioner to use an ILU0 preconditioner instead, i.e. I wrote a
>>> new
>>>
>>> class ISTLBackend_OVLP_CG_ILU0 :
>>>   public Dune::PDELab::OVLPScalarProductImplementation<GFS>,
>>>   public Dune::PDELab::LinearResultStorage
>>>
>>> [...]
>>>
>>> The 'apply' method, where the preconditioner is initialised, looks
>>> like this:
>>>
>>>      void apply(M& A, V& z, W& r, typename V::ElementType reduction)
>>>      {
>>>        typedef Dune::PDELab::OverlappingOperator<C,M,V,W> POP;
>>>        POP pop(c,A);
>>>        typedef Dune::PDELab::OVLPScalarProduct<GFS,V> PSP;
>>>        PSP psp(*this);
>>>        typedef Dune::SeqILU0<M,V,W,1> SeqPrec;
>>>        SeqPrec seqprec(A,steps);
>>>        typedef Dune::PDELab::OverlappingWrappedPreconditioner<C,GFS,SeqPrec> WPREC;
>>>        WPREC wprec(gfs,seqprec,c,this->parallelHelper());
>>>        int verb=0;
>>>        if (gfs.gridview().comm().rank()==0) verb=verbose;
>>>        Dune::CGSolver<V> solver(pop,psp,wprec,reduction,maxiter,verb);
>>>        Dune::InverseOperatorResult stat;
>>>        solver.apply(z,r,stat);
>>>        res.converged  = stat.converged;
>>>        res.iterations = stat.iterations;
>>>        res.elapsed    = stat.elapsed;
>>>        res.reduction  = stat.reduction;
>>>        res.conv_rate  = stat.conv_rate;
>>>        }
>>>
>>> Now, will this create a block preconditioner (i.e. when decomposing
>>> the matrix A into L and U, will it treat entries A_{ij} for which i
>>> and j are owned by different processors as zero) or will this
>>> construct the 'full' incomplete LU decomposition? In other words,
>>> will the fill structure of the matrix in a one process run be
>>> identical to the preconditioner in a 4 process run (up to rounding
>>> errors etc)?
>>> I guess the same question can be asked for the SSOR preconditioner
>>> (ISTLBackend_OVLP_CG_SSOR), i.e. if I do 5 SSOR smoothing steps,
>>> will each processor relax its local domain 5 times?
>>>
>>> Thank you very much,
>>>
>>> Eike
>>>
>>> _______________________________________________
>>> dune-pdelab mailing list
>>> dune-pdelab at dune-project.org
>>> http://lists.dune-project.org/mailman/listinfo/dune-pdelab
>>>
>>