Why JavaScript email obfuscation does not work
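#!/usr/bin/env python3
import re

from requests_html import HTMLSession

url = "put_your_url_here"  # replace with the address of the page to scan

session = HTMLSession()
resp = session.get(url)
# Run the page's JavaScript in headless Chromium so injected addresses end up in the DOM
resp.html.render()
# Pull every email-like string out of the rendered markup
emails = re.findall(r'[\w\.-]+@[\w\.-]+', resp.html.html)
print(emails)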
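The extraction step of the script above can also be tried offline, without a network connection or a browser. The following is a minimal sketch, not part of the original script: the function name `extract_emails` and the `SAMPLE_HTML` string are made up for illustration, and the sample stands in for whatever `resp.html.html` would contain after rendering. It applies the same regular expression and drops duplicates.

```python
import re

# Same permissive pattern as in the script above (not a full address validator).
EMAIL_RE = re.compile(r'[\w\.-]+@[\w\.-]+')

# Hypothetical stand-in for markup that has already been rendered,
# i.e. after any JavaScript has written the address into the DOM.
SAMPLE_HTML = (
    '<a href="mailto:user@example.com">user@example.com</a>'
    '<span id="contact">admin@example.org</span>'
)


def extract_emails(rendered_html):
    """Return email-like strings in order of first appearance, without duplicates."""
    matches = EMAIL_RE.findall(rendered_html)
    # dict.fromkeys keeps insertion order, so it doubles as an ordered de-duplication.
    return list(dict.fromkeys(matches))


if __name__ == "__main__":
    print(extract_emails(SAMPLE_HTML))
```

Because the pattern runs on the final markup, it does not matter how the address got there: once the page's JavaScript has executed, a generated address looks the same as one typed into the HTML by hand.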
Here’s a page with different obfuscation techniques. Feel free to use it for scraping.
Comments (4)
Wouter to Why JavaScript email obfuscation does not work
That depends on how you implement it. I pondered on this for a while and in the end decided to use a fake link that triggers a piece of JS that injects the proper e-mail mailto: link into the DOM tree. That works. I tried the above script and it fails for brainbaking.com ;-) Cheers!
over 2 years ago
Paul Philippov to Why JavaScript email obfuscation does not work
Well, you do not publish your email on the page; you create it dynamically upon interaction with the content. That falls outside the test cases. Although, as a visitor, I appreciate the elegance of the approach. This very comment form works in a similar way and successfully prevents spam comments from bots. But again, that is dynamic content creation _upon interaction_.
over 2 years ago
Jim Gagnon to Why JavaScript email obfuscation does not work
You should state the version of Python and the libraries needed to run your code. It's far from trivial to install requests_html and invoke it without error.
over 2 years ago
Paul Philippov to Why JavaScript email obfuscation does not work
Jim, Python 2 is deprecated, so it is Python 3. Installation is quite simple: $ pip install requests-html
over 2 years ago