3
�fG � @ s d dl Z G dd� d�ZdS )� Nc @ sr e Zd ZdZdZg Zg Zg Zg ZdZ dZ
i g fdd�Zdd� Zd d
� Z
dd� Zd
d� Zdd� Zdd� Zdd� ZdS )�SoSCleanerParsera� Parsers are used to build objects that will take a line as input,
parse it for a particular pattern (e.g. IP addresses) and then make any
necessary substitutions by referencing the SoSMap() associated with the
parser.
Ideally a new parser subclass will only need to set the class level attrs
in order to be fully functional.
:param conf_file: The configuration file to read from
:type conf_file: ``str``
:cvar name: The parser name, used in logging errors
:vartype name: ``str``
:cvar regex_patterns: A list of regex patterns to iterate over for every
line processed
:vartype regex_patterns: ``list``
:cvar mapping: Used by the parser to store and obfuscate matches
:vartype mapping: ``SoSMap()``
:cvar map_file_key: The key in the ``map_file`` to read when loading
previous obfuscation matches
:vartype map_file_key: ``str``
zUndefined ParserZunsetTc C s. | j |kr| jj|| j � || _| j� d S )N)�map_file_key�mappingZconf_update�skip_clean_files�_generate_skip_regexes)�self�configr � r �/usr/lib/python3.6/__init__.py�__init__2 s
zSoSCleanerParser.__init__c C s2 g | _ x&| j| j D ]}| j jtj|�� qW dS )z�Generate the regexes for the parser's configured parser_skip_files
or global skip_clean_files, so that we don't regenerate them for every
file when checking whether the parser should skip it.
N)Z
skip_patterns�parser_skip_filesr �append�re�compile)r �pr r r
r 8 s z'SoSCleanerParser._generate_skip_regexesc C s, | j s
dS x| jjD ]}| jj|� qW dS )z�Generate regexes for items the parser will be searching for
repeatedly without needing to generate them for every file and/or line
we process
Not used by all parsers.
N)�compile_regexesr �datasetZadd_regex_item)r Zobitemr r r
�generate_item_regexesA s z&SoSCleanerParser.generate_item_regexesc C sf d}x&| j D ]}tj||tj�r||fS qW | jrH| j|�\}}||7 }| j|�\}}||7 }||fS )a� This will be called for every line in every file we process, so that
every parser has a chance to scrub everything.
This will first try to identify needed obfuscations for items we have
already encountered (that is, if the parser uses compiled regexes) and
make those substitutions early on. After that, we parse the
line again looking for new matches.
r )�skip_line_patternsr �match�Ir �!_parse_line_with_compiled_regexes�_parse_line)r �line�countZskip_patternZ_rcount�_countr r r
�
parse_lineM s zSoSCleanerParser.parse_linec C sP d}xB| j jD ]6\}}|j|�r|j| j j|j� �|�\}}||7 }qW ||fS )ah Check the provided line against known items we have encountered
before and have pre-generated regex Pattern() objects for.
:param line: The line to parse for possible matches for obfuscation
:type line: ``str``
:returns: The obfuscated line and the number of changes made
:rtype: ``str``, ``int``
r )r �compiled_regexes�search�subn�get�lower)r r r �item�regr r r r
r a s
z2SoSCleanerParser._parse_line_with_compiled_regexesc C s� d}x�| j D ]�}dd� tj||tj�D �}|r|jdtd� |t|�7 }xF|D ]>}|j� }|| jjj � krlqN| jj
|�}||krN|j||�}qNW qW ||fS )aR Check the provided line against the parser regex patterns to try
and discover _new_ items to obfuscate
:param line: The line to parse for possible matches for obfuscation
:type line: ``str``
:returns: The obfuscated line, and the number of changes made
:rtype: ``tuple``, ``(str, int)``
r c S s g | ]}|d �qS )r r )�.0�mr r r
�
<listcomp>~ s z0SoSCleanerParser._parse_line.<locals>.<listcomp>T)�reverse�key)�regex_patternsr �findallr �sort�len�stripr r �valuesr �replace)r r r �patternZmatchesr Z new_matchr r r
r r s
zSoSCleanerParser._parse_linec C s� | j r@x�| jjD ]*\}}|j|�r|j| jj|j� �|�}qW nJxHt| jjj � ddd� d�D ]*\}}|| jj
krrq\||kr\|j||�}q\W |S )a� Parse a given string for instances of any obfuscated items, without
applying the normal regex comparisons first. This is mainly used to
obfuscate filenames that have, for example, hostnames in them.
Rather than try to regex match the string_data, just use the built-in
checks for substrings matching known obfuscated keys.
:param string_data: The line to be parsed
:type string_data: ``str``
:returns: The obfuscated line
:rtype: ``str``
Tc S s t | d �S )Nr )r, )�xr r r
�<lambda>� s z8SoSCleanerParser.parse_string_for_keys.<locals>.<lambda>)r'