What is the RegExp Pattern to Extract Bullet Points Between Two Group Words using VBA in Word?
I can't seem to figure out the RegExp to extract bullet points between two groups of words in a Word document.
For example:
risk assessment:
- test 1
- test 2
- test 3
internal audit
In this case I want to extract the bullet points between "risk assessment" and "internal audit", one bullet at a time, and assign each bullet to an Excel cell. As shown in the code below, I have it pretty much done, except I can't figure out the correct regex pattern. Any help would be great. Thanks in advance!
Sub populateexceltable()
    Dim fd As Office.FileDialog
    Set fd = Application.FileDialog(msoFileDialogFilePicker)
    With fd
        .AllowMultiSelect = False
        .Title = "please select file."
        .Filters.Clear
        .Filters.Add "word 2007-2013", "*.docx"
        If .Show = True Then
            txtfilename = .SelectedItems(1)
        End If
    End With

    Dim wordapp As Word.Application
    Set wordapp = CreateObject("Word.Application")
    Dim worddoc As Word.Document
    Set worddoc = wordapp.Documents.Open(txtfilename)
    Dim str As String: str = worddoc.Content.Text ' assign entire document content to string

    Dim rex As New RegExp
    rex.Pattern = "\b[^risk assessment\s].*[^internal audit\s]"
    Dim i As Long: i = 1
    rex.Global = True
    For Each mtch In rex.Execute(str)
        Debug.Print mtch
        Range("A" & i).Value = mtch
        i = i + 1
    Next mtch

    worddoc.Close
    wordapp.Quit
End Sub
This is the long way around the problem, but it works.
Steps I'm taking:
- Find the bullet list items using the keywords before and after the list in the RegExp.
- Use a (group) in the RegExp pattern so you can extract everything in between the words.
- Store the listed items group in a string.
- Split the string by the new line character into a new array.
- Output each array item to Excel.
- Loop again, since there may be more than 1 list in the document.
Note: I don't see any code that links to the Excel workbook. I'll assume that part is working.
Dim rex As New RegExp
rex.Pattern = "(\brisk assessment\s)(.*)(internal\saudit\s)"
rex.Global = True
rex.MultiLine = True
rex.IgnoreCase = True
Dim linearray() As String
Dim mymatches As Object
Set mymatches = rex.Execute(str)
For Each mtch In rex.Execute(str)
    'Debug.Print mtch.SubMatches(1)
    ' SubMatches(1) is the second group: the text between the two keywords
    linearray = Split(mtch.SubMatches(1), vbLf)
    For x = LBound(linearray) To UBound(linearray)
        'Debug.Print linearray(x)
        Range("A" & i).Value = linearray(x)
        i = i + 1
    Next
Next mtch
My test page looks like this:
The results from the inner Debug.Print line return this:
item 1
item 2
item 3
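The steps above (match between keywords, capture the group, split into lines, repeat for each list) can also be wrapped in a reusable function. The following is only a sketch under stated assumptions, not the code from the answer: `ExtractLinesBetween` and `DemoExtract` are hypothetical names, the regex object is created with late binding via `CreateObject("VBScript.RegExp")`, and the keywords are assumed to contain no regex metacharacters. It swaps the greedy `(.*)` for a lazy `([\s\S]*?)` so that, if the document has more than one list, each match should stop at the nearest closing keyword, and it normalizes carriage returns to line feeds before splitting.

```vba
' Hypothetical helper (not from the original post): returns the non-empty
' lines found between startWord and endWord, one Collection item per line.
Function ExtractLinesBetween(ByVal source As String, _
                             ByVal startWord As String, _
                             ByVal endWord As String) As Collection
    Dim result As New Collection
    Dim rex As Object
    Set rex = CreateObject("VBScript.RegExp")   ' late binding, no library reference needed

    ' Assumption: startWord and endWord contain no regex metacharacters.
    ' [\s\S]*? is lazy, so each match ends at the nearest endWord.
    rex.Pattern = startWord & "([\s\S]*?)" & endWord
    rex.Global = True
    rex.IgnoreCase = True

    Dim mtch As Object
    Dim body As String
    Dim lines() As String
    Dim x As Long

    For Each mtch In rex.Execute(source)
        ' Normalize CR LF and lone CR to LF before splitting into lines.
        body = Replace(mtch.SubMatches(0), vbCrLf, vbLf)
        body = Replace(body, vbCr, vbLf)
        lines = Split(body, vbLf)
        For x = LBound(lines) To UBound(lines)
            ' Skip the empty entries left by leading or trailing line breaks.
            If Len(Trim$(lines(x))) > 0 Then result.Add Trim$(lines(x))
        Next x
    Next mtch

    Set ExtractLinesBetween = result
End Function

' Usage sketch with a made-up sample string instead of a real document.
Sub DemoExtract()
    Dim sample As String
    Dim item As Variant
    sample = "risk assessment:" & vbCr & "- test 1" & vbCr & _
             "- test 2" & vbCr & "internal audit"
    For Each item In ExtractLinesBetween(sample, "risk assessment:", "internal audit")
        Debug.Print item
    Next item
End Sub
```

Passing the heading with its trailing colon as `startWord` keeps the colon out of the captured lines. To write the results to a worksheet instead of the Immediate window, the `Debug.Print` line could be replaced with the same `Range("A" & i).Value = ...` assignment used above.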