I have thousands of Telegram messages stored in my Elasticsearch index. I need to extract the email addresses that have been mentioned by users on Telegram. The email addresses are in [_source][text] and appear inside the body of posts, so I need to use a regex:
([\s]{0,10}[\w.]{1,63}@[\w.]{1,63}[\s]{0,10})
to do the following:
- a) extract the email address from each message;
- b) create a new Maltego entity
I am trying this code (I am totally new to Python/to coding!), but it does not work:
#!/usr/bin/env python
from elasticsearch import Elasticsearch
from MaltegoTransform import *
import json
import os
import re
m = MaltegoTransform()
indexname = sys.argv[1]
es = Elasticsearch('localhost:9200')
res = es.search(index=indexname, size=1000, body={"query": {"match":
    {"entities.type": "email"}}})
for doc in res['hits']['hits']:
    def get_emails(data=""):
        addresses = re.findall(r'[\s]{0,10}[\w.]{1,63}@[\w.]{1,63}[\s]{0,10}', data)
        print addresses #does not print anything#
        m.addEntity('maltego.EmailAddress', ''.join(WHAT?))
m.returnOutput()
This is a sample of my json output:
{
  took: 5,
  timed_out: false,
  _shards: {
    total: 1,
    successful: 1,
    skipped: 0,
    failed: 0
  },
  hits: {
    total: 43,
    max_score: 7.588423,
    hits: [
      {
        _index: "MY_INDEX",
        _type: "items",
        _id: "CHANNEL ID",
        _score: 7.588423,
        _source: {
          id: 2411,
          audio: { },
          author_signature: null,
          caption: null,
          channel_chat_created: null,
          chat: {},
          command: null,
          service: null,
          sticker: { },
          supergroup_chat_created: null,
          text: HERE'S THE TEXT CONTAINING EMAIL ADDRESS.
The text I need to search for emails is therefore nested in [_source][text]. I need to extract only the email address within it (by regex), and be able to print it and pass it to a function in order to create a graph entity in Maltego. The function call looks like this:
m.addEntity('maltego.EmailAddress', ''.join(THE EMAIL ENTITY EXTRACTED WITH REGEX))
2 Answers
In the end I was able to get the code working, like this:
The problem with this code is that if multiple email addresses are in the same post, the results come back as one concatenated string with no separator between the addresses.
How can I split the addresses in order to add each one to my Maltego graph with .join(email)?
Adding the email addresses will depend on what your library requires. The correct approach could be to call addEntity() once for each email address, or it might be to add all addresses in a single call.
To add each email address separately, loop over the list of matches and call addEntity() once per address.
Using ''.join(email), as you have seen, will create a single string with no delimiters between email addresses. To add all email addresses in one call with a , separating them, join with ','.join(email) instead.
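Both options can be sketched as follows. This is only an illustration under a few assumptions: the pattern is the one from the question with the backslashes restored and the surrounding whitespace groups dropped, the sample response is made up to match the shape of the output shown in the question, and FakeTransform is a hypothetical stand-in for MaltegoTransform so the snippet runs without Maltego or Elasticsearch installed.

```python
import re

# Pattern from the question with backslashes restored. The optional
# whitespace around the address is left out (an assumption) so that
# matches do not need to be stripped afterwards.
EMAIL_RE = re.compile(r'[\w.]{1,63}@[\w.]{1,63}')


def extract_emails(response):
    """Collect every address found in _source.text across all hits."""
    emails = []
    for hit in response['hits']['hits']:
        # text can be missing or null for non-text messages
        text = hit['_source'].get('text') or ''
        emails.extend(EMAIL_RE.findall(text))
    return emails


class FakeTransform(object):
    """Hypothetical stand-in for MaltegoTransform; only records calls."""

    def __init__(self):
        self.entities = []

    def addEntity(self, entity_type, value):
        self.entities.append((entity_type, value))


# Made-up response shaped like the sample output in the question.
sample = {'hits': {'hits': [
    {'_source': {'text': 'write to alice@example.com or bob@example.org today'}},
    {'_source': {'text': None}},
]}}

emails = extract_emails(sample)

# Option 1: one entity per address.
per_address = FakeTransform()
for email in emails:
    per_address.addEntity('maltego.EmailAddress', email)

# Option 2: a single entity holding a comma-separated string.
single_call = FakeTransform()
single_call.addEntity('maltego.EmailAddress', ','.join(emails))
```

With the real library, m from the question would take the place of FakeTransform. Note that because . is part of the character class, a full stop typed directly after an address will be captured as part of the match.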