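On ARM, LR is also where the processor puts the return address when an exception is taken, e.g. a data abort or prefetch abort. You define handlers for those exceptions, and by looking at LR in the handler you can easily work out the address of the instruction that caused the exception (LR holds that address plus a small fixed offset that depends on the exception type).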
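To make that concrete, here is a rough sketch in C of how a handler might turn the saved LR value into the address of the faulting instruction. It assumes the classic ARM exception model (ARMv7-A/R and earlier, not Cortex-M), where the banked LR holds the faulting instruction's address plus 4 for a prefetch abort and plus 8 for a data abort. The names `abort_kind` and `faulting_instruction` are made up for illustration; a real handler would first read LR in assembly and pass it in.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum abort_kind { PREFETCH_ABORT, DATA_ABORT };

/*
 * Assumption: classic ARM abort mode, where on entry
 *   prefetch abort: LR = faulting instruction + 4
 *   data abort:     LR = faulting instruction + 8
 * 'lr' is the value the handler read from LR on entry.
 */
uint32_t faulting_instruction(enum abort_kind kind, uint32_t lr)
{
    uint32_t offset = (kind == DATA_ABORT) ? 8u : 4u;
    return lr - offset;
}
```

For example, if a data abort handler sees LR = 0x8008, the instruction that caused the abort would be at 0x8000 under these assumptions.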
myetrx:
I was wondering whether providing a separate register outweighs the speed of stack access for push/pop operations
The ARM architecture is an example of a 'Reduced Instruction Set Computer' (RISC) processor design. You can learn more about these sorts of trade-offs by researching the features of RISC. For recommended reading, see the 'Notes and references' section of the Wikipedia article: