[SOLVED] Libre office Calc - cannot figure get REGEX() function to extract second word from sentence

Status
Not open for further replies.
Hi.

Background - I'm used to use Notepad2-mod or Notepad3 to work with regular expression in text files. Libre Office does use slightly different set of syntax. But this I don't get.

I have a spread sheet in Libre Office with this cell content:

headline A1:E1
Text_String; First word; Second word; date; clock

A2 "Varekjøp butikken på nesset Dato 05.08 kl. 14.27 "

B2 =REGEX(A2;"^(\S+) "; ;1)
B2 returns "Varekjøp" as expected

C2 =REGEX(A2;"^\S+ ([:alpha:]+) ";"$1")
C2 returns "butikkenpå nesset Dato 05.08 kl. 14.27 "


So far, extracting the first word seems to work just fine.

However, extracting second word doesn't work as expected.
  • It returns all contents of the remainder characters after second word.
  • It also remove a space between second and third word.

I'd be very glad if somebody could point me in right direction. I hope that If I get to know what I'm doing wrong here, I should be able of also extracting date and time as well (no need to help on that in particular, because I need to learn this better).

Thanks in advance
 
getting the first word
get the length of that word
use that to substr the 2nd search
Well, if I was making a script, then it would be a solution, but I tries to figure a solution within the boundary of the regex() function in Libre Office.

I know I can easily use another plain text editor and do a regular search and replace as a workaround, but I want to stick to Libre Office, and yes I'll try to avoid to use scripting (because in Libre Office the phyton scripts must be located at a specific location in the user home folder, making it not portable, so to speak). And banger (edit: just realized I'm not into street terms, sorry for bad English) to myself - I haven't learned phyton yet (that's totally on me).

Ok bottom line, thanks for suggestion but goes out of scope by using script for this.
 
Update : Months after posting this, I spend some time with a spreadsheet and I finally had my breakthrough on this matter.

The full formula I use to extract date and time look like this:
Code:
=IF(ISNUMBER( FIND("Dato";B3;1));REGEX(B3;"(.*Dato )([0-9]{2}\.[0-9]{2}).*";"$2.2022";1);"")
=IF(ISNUMBER( FIND("Dato";B15;1));REGEX(B15;"(.*kl\. )([0-9]{2})\.([0-9]{2}).*";"$2:$3";1);"")
The if-statement is there so that if no pattern found, insert blank string instead of an error output.

This of course only work in a format of norwegian origin, point is to have a working example.
 
Status
Not open for further replies.