The Majestic FAQ says:
Where is the data coming from?
Our data comes from the World Wide Web itself. The Majestic-12: Distributed Search Engine does not meta-search or otherwise query other search engines: we are the search engine! Over a long period of time we have developed software capable of crawling and indexing large amounts of web data. This index is a big stepping stone towards relevant full-text search. The purpose of the index is to allow relevancy research as well as to help fund continued activities in development of a competitive community-driven general-purpose web-scale search engine.