Drupal: is that site hacked?
Some months back I needed to know how much code had been modified in a given Drupal site. Installing a copy of the site out of the box was not straightforward (it had hardcoded absolute paths in custom modules, among other annoying things).
http://dgo.to/hacked in conjunction with http://dgo.to/diff is a well-known solution to find out whether core or contrib have been modified. This solution relies on a site that is able to bootstrap and run. As said before, it was difficult to install the site locally. In fact I refused to spend my time making it work and dealing with its cumbersome admin area. I didn't need to get the site working, I just needed to know how much it was hacked, so I thought of a solution based on Drush's make.
Drush's make has a reverse feature: generating a makefile from a given site. Simply run drush generate-makefile sitename.make. The idea was to get a fresh copy of the same versions of core and projects, then compare both copies to detect which files had been altered.
So the approach is as follows:
- Generate a makefile from the original site. The resulting makefile is a description of the projects (and their versions) used to build the site. This makefile may not be complete in some cases. For example, generate-makefile doesn't consider libraries or external projects not hosted on drupal.org. It may also be less effective if projects were Git clones (and git_deploy is not enabled) or, even worse, CVS checkouts. So you may need to edit the resulting makefile and adjust it.
- Run the makefile into another location in order to get a vanilla copy of the same code.
- Compare both directory trees. Here you can use two different approaches: Unix's diff or a Python script I wrote for this same purpose.
And here are the basic command lines:
:/var/www/htdocs-original$ drush generate-makefile ../hackedsite.make
:/var/www/htdocs-original$ cd ..
:/var/www$ drush make hackedsite.make htdocs-vanilla
:/var/www$ diff -r -q --exclude=sites/default/files --exclude=translations htdocs-original htdocs-vanilla
Here's a snippet of the output I get:
Only in htdocs-vanilla/sites/all: README.txt
Only in htdocs-original/sites/all/modules: artistas
Files htdocs-original/sites/all/modules/calendar/CHANGELOG.txt and htdocs-vanilla/sites/all/modules/calendar/CHANGELOG.txt differ
Files htdocs-original/sites/all/modules/calendar/LICENSE.txt and htdocs-vanilla/sites/all/modules/calendar/LICENSE.txt differ
Files htdocs-original/sites/all/modules/calendar/calendar.css and htdocs-vanilla/sites/all/modules/calendar/calendar.css differ
In the above example I used Unix's diff. My Python script works essentially the same way but provides some advantages and other features.
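To give an idea of what a tree comparison looks like in Python, here is a minimal sketch. It is not the script mentioned above, just an illustration written under a few assumptions: the function name compare_trees and its exclude parameter are hypothetical, and exclude accepts either a bare name (like translations) or a path relative to the tree root (like sites/default/files).

```python
import filecmp
import os


def compare_trees(original, vanilla, exclude=()):
    """Compare two directory trees by content.

    Returns three lists of paths relative to the roots, using "/" as
    separator: entries only in `original`, entries only in `vanilla`,
    and entries present in both whose contents differ.
    """
    skip = set(exclude)
    only_original, only_vanilla, changed = [], [], []

    def walk(rel):
        left = os.path.join(original, rel)
        right = os.path.join(vanilla, rel)
        left_names = set(os.listdir(left))
        right_names = set(os.listdir(right))
        for name in sorted(left_names | right_names):
            path = f"{rel}/{name}" if rel else name
            # Skip entries excluded by bare name or by relative path.
            if name in skip or path in skip:
                continue
            if name not in right_names:
                only_original.append(path)
            elif name not in left_names:
                only_vanilla.append(path)
            else:
                a = os.path.join(left, name)
                b = os.path.join(right, name)
                if os.path.isdir(a) and os.path.isdir(b):
                    walk(path)
                elif os.path.isdir(a) or os.path.isdir(b):
                    # A file on one side and a directory on the other.
                    changed.append(path)
                elif not filecmp.cmp(a, b, shallow=False):
                    # shallow=False forces a byte-by-byte comparison.
                    changed.append(path)

    walk("")
    return only_original, only_vanilla, changed
```

With the directories from the example, a call would look like compare_trees("htdocs-original", "htdocs-vanilla", exclude=["sites/default/files", "translations"]). Directories that exist on one side only are reported once and not descended into, which keeps the result short for custom modules such as artistas.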