Added support for multiple search results for ScreenScraper.

Also fixed some other scraping issues and added some additional scraper logging.
Leon Styhre 2020-11-14 15:30:49 +01:00
parent f195fcf8a7
commit e5fcb51f57
11 changed files with 129 additions and 72 deletions

View file

@ -424,11 +424,15 @@ The multi-scraper is launched from the main menu, it's the first option on the m
### Scraping process
The process of scraping games is basically identical between the single-game scraper and the multi-scraper. You're presented with the returned scraper results, and you're able to refine the search if the scraper could not find your game. Sometimes just removing some extra characters such as disk information or other data from the search name yields a better result.
The process of scraping games is basically identical between the single-game scraper and the multi-scraper. You're presented with the returned scraper results, and you're able to refine the search if the scraper could not find your game. Sometimes small changes like adding or removing a colon or a minus sign can yield better results. Note that the searching is handled entirely by the scraper service; ES just presents the results returned from the service.
In general the actual file name of the game is used for the scraper, the exception being MAME/arcade games when using TheGamesDB, as the MAME names are then expanded to the full game names.
By default, ES will search using the metadata name of the game. If no name has been defined via scraping or manually using the metadata editor, this name will correspond to the physical file name minus all text inside either normal brackets '()' or square brackets '[]'. So for example the physical filename 'Mygame (U) [v2].zip' will be stripped to simply 'Mygame' when performing the scraping.
Hopefully the scraping process should be self-explanatory once you try it in ES.
The behavior of using the metadata name rather than the file name can be changed using the setting _Search using metadata name_ as described [below](USERGUIDE.md#other-settings).
Note that there is an exception to this behavior for arcade games (MAME and Neo Geo). For ScreenScraper the short MAME names are used by default as this scraper service fully supports that. For TheGamesDB the short names are instead expanded to the full game names using a lookup in the MAME name database supplied with the ES installation. It's possible to override this automatic behavior by using the _Refine Search_ button in the scraper GUI if the search did not yield any results, or if the wrong game was returned. In general though, searching for arcade games is very reliable, assuming the physical game files follow the MAME name standard.
Apart from this, hopefully the scraping process should be self-explanatory once you try it in ES.
### Manually copying game media files
@ -695,7 +699,7 @@ This will fill the entire screen surface but will possibly break the aspect rati
**Display game info overlay**
This will display an overlay in the bottom left corner, showing the game name and the game system name. A star following the game name indicates that it's a favorite.
This will display an overlay in the upper left corner, showing the game name and the game system name. A star following the game name indicates that it's a favorite.
**Render scanlines** _(OpenGL renderer only)_
@ -731,7 +735,7 @@ This will fill the entire screen surface but will possibly break the aspect rati
**Display game info overlay**
This will display an overlay in the bottom left corner, showing the game name and the game system name. A star following the game name indicates that it's a favorite.
This will display an overlay in the upper left corner, showing the game name and the game system name. A star following the game name indicates that it's a favorite.
**Render scanlines** _(OpenGL renderer only)_
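The user guide text in this diff says that a file name such as 'Mygame (U) [v2].zip' is reduced to 'Mygame' before searching. The following is only a minimal sketch of that kind of stripping, not the ES implementation: `stripBracketedText` is a hypothetical helper name, and the way it handles nesting and unmatched brackets is an assumption.

```cpp
#include <string>

// Hypothetical helper (not part of ES): drops the file extension and any text
// enclosed in '()' or '[]', then trims leading and trailing spaces.
std::string stripBracketedText(const std::string& fileName)
{
    // Drop the extension, i.e. everything from the last dot onwards.
    // If there is no dot, find_last_of() returns npos and the whole name is kept.
    const std::string stem = fileName.substr(0, fileName.find_last_of('.'));

    std::string result;
    int depth = 0; // Number of currently open brackets of either kind.
    for (const char c : stem) {
        if (c == '(' || c == '[') {
            ++depth;
        }
        else if (c == ')' || c == ']') {
            // Assumption: an unmatched closing bracket is simply dropped.
            if (depth > 0)
                --depth;
        }
        else if (depth == 0) {
            result += c;
        }
    }

    // Trim the spaces left behind by the removed bracket groups.
    const auto first = result.find_first_not_of(' ');
    if (first == std::string::npos)
        return "";
    const auto last = result.find_last_not_of(' ');
    return result.substr(first, last - first + 1);
}
```

Calling `stripBracketedText("Mygame (U) [v2].zip")` with this sketch yields `Mygame`, matching the example in the guide text.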

View file

@ -390,6 +390,15 @@ const bool FileData::isArcadeAsset()
MameNames::getInstance()->isDevice(stem)));
}
const bool FileData::isArcadeGame()
{
const std::string stem = Utils::FileSystem::getStem(mPath);
return ((mSystem && (mSystem->hasPlatformId(PlatformIds::ARCADE) ||
mSystem->hasPlatformId(PlatformIds::NEOGEO))) &&
(!MameNames::getInstance()->isBios(stem) &&
!MameNames::getInstance()->isDevice(stem)));
}
FileData* FileData::getSourceFileData()
{
return this;

View file

@ -70,7 +70,7 @@ public:
const std::string getVideoPath() const;
bool getDeletionFlag() { return mDeletionFlag; };
void setDeletionFlag() { mDeletionFlag = true; };
void setDeletionFlag(bool setting) { mDeletionFlag = setting; };
const std::vector<FileData*>& getChildrenListToDisplay();
std::vector<FileData*> getFilesRecursive(unsigned int typeMask,
@ -87,6 +87,7 @@ public:
virtual std::string getKey();
const bool isArcadeAsset();
const bool isArcadeGame();
inline std::string getFullPath() { return getPath(); };
inline std::string getFileName() { return Utils::FileSystem::getFileName(getPath()); };
virtual FileData* getSourceFileData();

View file

@ -11,12 +11,10 @@
#include "utils/FileSystemUtil.h"
#include "utils/StringUtil.h"
#include "FileData.h"
#include "FileFilterIndex.h"
#include "Log.h"
#include "Settings.h"
#include "SystemData.h"
#include <chrono>
#include <pugixml.hpp>
FileData* findOrCreateFile(SystemData* system, const std::string& path, FileType type)

View file

@ -21,6 +21,7 @@
#include "FileFilterIndex.h"
#include "FileSorts.h"
#include "GuiMetaDataEd.h"
#include "MameNames.h"
#include "Sound.h"
#include "SystemData.h"
@ -333,7 +334,14 @@ void GuiGamelistOptions::openMetaDataEd()
const std::vector<MetaDataDecl>& mdd = file->metadata.getMDD();
for (auto it = mdd.cbegin(); it != mdd.cend(); it++) {
if (it->key == "name") {
if (file->isArcadeGame()) {
// If it's a MAME or Neo Geo game, expand the game name accordingly.
file->metadata.set(it->key, MameNames::getInstance()->
getCleanName(file->getCleanName()));
}
else {
file->metadata.set(it->key, file->getDisplayName());
}
continue;
}
file->metadata.set(it->key, it->defaultValue);
@ -343,8 +351,9 @@ void GuiGamelistOptions::openMetaDataEd()
mWindow->invalidateCachedBackground();
// Remove the folder entry from the gamelist.xml file.
file->setDeletionFlag();
file->setDeletionFlag(true);
file->getParent()->getSystem()->writeMetaData();
file->setDeletionFlag(false);
};
deleteGameBtnFunc = [this, file] {

View file

@ -323,7 +323,7 @@ void GuiScraperSearch::onSearchDone(const std::vector<ScraperSearchResult>& resu
else {
mFoundGame = false;
ComponentListRow row;
row.addElement(std::make_shared<TextComponent>(mWindow, "NO GAMES FOUND - SKIP",
row.addElement(std::make_shared<TextComponent>(mWindow, "NO GAMES FOUND",
font, color), true);
if (mSkipCallback)
@ -659,15 +659,13 @@ void GuiScraperSearch::openInputScreen(ScraperSearchParams& params)
// If the setting to search based on metadata name has been set, then show this string
// regardless of whether the entry is an arcade game and TheGamesDB is used.
if (Settings::getInstance()->getBool("ScraperSearchMetadataName")) {
searchString = params.game->metadata.get("name");
searchString = Utils::String::removeParenthesis(params.game->metadata.get("name"));
}
else {
// If searching based on the actual file name, then expand to the full game name
// in case the scraper is set to TheGamesDB and it's an arcade game. This is required
// as TheGamesDB has issues with searches using the short MAME names.
if (Settings::getInstance()->getString("Scraper") == "thegamesdb" &&
(params.system->hasPlatformId(PlatformIds::ARCADE) ||
params.system->hasPlatformId(PlatformIds::NEOGEO)))
if (params.game->isArcadeGame())
searchString = MameNames::getInstance()->getCleanName(params.game->getCleanName());
else
searchString = params.game->getCleanName();

View file

@ -10,6 +10,7 @@
#include "scrapers/GamesDBJSONScraper.h"
#include "scrapers/GamesDBJSONScraperResources.h"
#include "utils/StringUtil.h"
#include "utils/TimeUtil.h"
#include "FileData.h"
#include "Log.h"
@ -130,14 +131,13 @@ void thegamesdb_generate_json_scraper_requests(const ScraperSearchParams& params
// If the setting to search based on the metadata name has been set, then search
// using this regardless of whether the entry is an arcade game.
if (Settings::getInstance()->getBool("ScraperSearchMetadataName")) {
cleanName = params.game->metadata.get("name");
cleanName = Utils::String::removeParenthesis(params.game->metadata.get("name"));
}
else {
// If not searching based on the metadata name, then check whether it's an
// arcade game and if so expand to the full game name. This is required as
// TheGamesDB has issues with searching using the short MAME names.
if (params.system->hasPlatformId(PlatformIds::ARCADE) ||
params.system->hasPlatformId(PlatformIds::NEOGEO))
if (params.game->isArcadeGame())
cleanName = MameNames::getInstance()->getCleanName(params.game->getCleanName());
else
cleanName = params.game->getCleanName();
@ -457,4 +457,8 @@ void TheGamesDBJSONRequest::process(const std::unique_ptr<HttpReq>& req,
LOG(LogError) << "Error while processing game: " << e.what();
}
}
if (results.size() == 0) {
LOG(LogDebug) << "TheGamesDBJSONRequest::process(): No games found.";
}
}

View file

@ -152,16 +152,22 @@ pugi::xml_node find_child_by_attribute_list(const pugi::xml_node& node_parent,
}
void screenscraper_generate_scraper_requests(const ScraperSearchParams& params,
std::queue< std::unique_ptr<ScraperRequest> >& requests,
std::queue<std::unique_ptr<ScraperRequest>>& requests,
std::vector<ScraperSearchResult>& results)
{
std::string path;
ScreenScraperRequest::ScreenScraperConfig ssConfig;
if (params.game->isArcadeGame())
ssConfig.isArcadeSystem = true;
else
ssConfig.isArcadeSystem = false;
if (params.nameOverride == "") {
if (Settings::getInstance()->getBool("ScraperSearchMetadataName"))
path = ssConfig.getGameSearchUrl(params.game->metadata.get("name"));
path = ssConfig.getGameSearchUrl(
Utils::String::removeParenthesis(params.game->metadata.get("name")));
else
path = ssConfig.getGameSearchUrl(params.game->getCleanName());
}
@ -252,23 +258,54 @@ void ScreenScraperRequest::processGame(const pugi::xml_document& xmldoc,
std::vector<ScraperSearchResult>& out_results)
{
pugi::xml_node data = xmldoc.child("Data");
pugi::xml_node game = data.child("jeu");
if (game) {
ScraperSearchResult result;
ScreenScraperRequest::ScreenScraperConfig ssConfig;
result.gameID = game.attribute("id").as_string();
// Check if our username was included in the response (assuming an account is used).
// It seems as if this information is randomly missing from the server response, which
// also seems to correlate with missing scraper allowance data. This is however a scraper
// service issue so we're not attempting to compensate for it here.
if (Settings::getInstance()->getBool("ScraperUseAccountScreenScraper") &&
Settings::getInstance()->getString("ScraperUsernameScreenScraper") != "" &&
Settings::getInstance()->getString("ScraperPasswordScreenScraper") != "") {
std::string userID = data.child("ssuser").child("id").text().get();
if (userID != "") {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): Scraping using account '" <<
userID << "'.";
}
else {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): The configured account '" <<
Settings::getInstance()->getString("ScraperUsernameScreenScraper") <<
"' was not included in the scraper response, wrong username or password?";
}
}
// Find how many more requests we can make before the scraper request
// allowance counter is reset. For some strange reason the ssuser information
// is not provided for all games even though the request looks identical apart
// from the game name.
unsigned requestsToday =
data.child("ssuser").child("requeststoday").text().as_uint();
unsigned maxRequestsPerDay =
data.child("ssuser").child("maxrequestsperday").text().as_uint();
result.scraperRequestAllowance = maxRequestsPerDay - requestsToday;
unsigned requestsToday = data.child("ssuser").child("requeststoday").text().as_uint();
unsigned maxRequestsPerDay = data.child("ssuser").child("maxrequestsperday").text().as_uint();
unsigned int scraperRequestAllowance = maxRequestsPerDay - requestsToday;
// Scraping allowance.
if (maxRequestsPerDay > 0) {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): Daily scraping allowance: " <<
requestsToday << "/" << maxRequestsPerDay << " (" <<
scraperRequestAllowance << " remaining).";
}
else {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): Daily scraping allowance: "
"No statistics were provided with the response.";
}
if (data.child("jeux"))
data = data.child("jeux");
for (pugi::xml_node game = data.child("jeu"); game; game = game.next_sibling("jeu")) {
ScraperSearchResult result;
ScreenScraperRequest::ScreenScraperConfig ssConfig;
result.scraperRequestAllowance = scraperRequestAllowance;
result.gameID = game.attribute("id").as_string();
std::string region =
Utils::String::toLower(Settings::getInstance()->getString("ScraperRegion"));
@ -325,7 +362,7 @@ void ScreenScraperRequest::processGame(const pugi::xml_document& xmldoc,
result.mdl.get("releasedate");
}
/// Developer for the game( Xpath: Data/jeu[0]/developpeur ).
// Developer for the game (Xpath: Data/jeu[0]/developpeur).
std::string developer = game.child("developpeur").text().get();
if (!developer.empty()) {
result.mdl.set("developer", Utils::String::replace(developer, "&nbsp;", " "));
@ -333,7 +370,7 @@ void ScreenScraperRequest::processGame(const pugi::xml_document& xmldoc,
result.mdl.get("developer");
}
// Publisher for the game ( Xpath: Data/jeu[0]/editeur ).
// Publisher for the game (Xpath: Data/jeu[0]/editeur).
std::string publisher = game.child("editeur").text().get();
if (!publisher.empty()) {
result.mdl.set("publisher", Utils::String::replace(publisher, "&nbsp;", " "));
@ -341,7 +378,7 @@ void ScreenScraperRequest::processGame(const pugi::xml_document& xmldoc,
result.mdl.get("publisher");
}
// Genre fallback language: EN. ( Xpath: Data/jeu[0]/genres/genre[*] ).
// Genre fallback language: EN. (Xpath: Data/jeu[0]/genres/genre[*]).
std::string genre = find_child_by_attribute_list(game.child("genres"),
"genre", "langue", { language, "en" }).text().get();
if (!genre.empty()) {
@ -358,34 +395,6 @@ void ScreenScraperRequest::processGame(const pugi::xml_document& xmldoc,
result.mdl.get("players");
}
// Username, if an account is used for scraping.
if (Settings::getInstance()->getBool("ScraperUseAccountScreenScraper") &&
Settings::getInstance()->getString("ScraperUsernameScreenScraper") != "" &&
Settings::getInstance()->getString("ScraperPasswordScreenScraper") != "") {
// Check if our username was included in the response.
std::string userID = data.child("ssuser").child("id").text().get();
if (userID != "") {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): Scraping using account '" <<
userID << "'.";
}
else {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): The configured account '" <<
Settings::getInstance()->getString("ScraperUsernameScreenScraper") <<
"' was not included in the scraper response, wrong username or password?";
}
}
// Scraping allowance.
if (maxRequestsPerDay > 0) {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): Daily scraping allowance: " <<
requestsToday << "/" << maxRequestsPerDay << " (" <<
result.scraperRequestAllowance << " remaining).";
}
else {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): Daily scraping allowance: "
"No statistics were provided with the response.";
}
// Media super-node.
pugi::xml_node media_list = game.child("medias");
@ -409,6 +418,10 @@ void ScreenScraperRequest::processGame(const pugi::xml_document& xmldoc,
result.mediaURLFetch = COMPLETED;
out_results.push_back(result);
} // Game.
if (out_results.size() == 0) {
LOG(LogDebug) << "ScreenScraperRequest::processGame(): No games found.";
}
}
void ScreenScraperRequest::processMedia(
@ -505,12 +518,31 @@ void ScreenScraperRequest::processList(const pugi::xml_document& xmldoc,
std::string ScreenScraperRequest::ScreenScraperConfig::getGameSearchUrl(
const std::string gameName) const
{
std::string screenScraperURL = API_URL_BASE
std::string screenScraperURL;
// If the game is an arcade game, then search using the individual ROM name rather than
// running a wider text matching search. Also run this search mode if the game name is
// shorter than four characters, as screenscraper.fr will otherwise throw an error that
// the necessary search parameters were not provided with the search.
// Possibly this is because a search using less than four characters would return too
// many results. But there are some games with really short names, so it's annoying that
// they can't be searched using this method.
if (isArcadeSystem || gameName.size() < 4) {
screenScraperURL = API_URL_BASE
+ "/jeuInfos.php?devid=" + Utils::String::scramble(API_DEV_U, API_DEV_KEY)
+ "&devpassword=" + Utils::String::scramble(API_DEV_P, API_DEV_KEY)
+ "&softname=" + HttpReq::urlEncode(API_SOFT_NAME)
+ "&output=xml"
+ "&romnom=" + HttpReq::urlEncode(gameName);
}
else {
screenScraperURL = API_URL_BASE
+ "/jeuRecherche.php?devid=" + Utils::String::scramble(API_DEV_U, API_DEV_KEY)
+ "&devpassword=" + Utils::String::scramble(API_DEV_P, API_DEV_KEY)
+ "&softname=" + HttpReq::urlEncode(API_SOFT_NAME)
+ "&output=xml"
+ "&recherche=" + HttpReq::urlEncode(gameName);
}
// Username / password, if this has been set up and activated.
if (Settings::getInstance()->getBool("ScraperUseAccountScreenScraper")) {
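The endpoint choice made in `getGameSearchUrl()` above can be shown in isolation. This is a simplified sketch rather than the committed code: `selectSearchEndpoint` and `kApiUrlBase` are hypothetical names, the base URL is a placeholder, and the developer credentials, `softname`, `output` parameters and URL encoding used by the real function are left out.

```cpp
#include <string>

// Placeholder base URL, used for illustration only.
const std::string kApiUrlBase = "https://example.invalid/api";

// Hypothetical helper: picks the ROM name lookup (jeuInfos.php) for arcade
// games and for names shorter than four characters, and the wider text
// search (jeuRecherche.php) otherwise.
// Assumption: gameName is already URL-encoded by the caller.
std::string selectSearchEndpoint(const std::string& gameName, bool isArcadeSystem)
{
    if (isArcadeSystem || gameName.size() < 4)
        return kApiUrlBase + "/jeuInfos.php?romnom=" + gameName;

    return kApiUrlBase + "/jeuRecherche.php?recherche=" + gameName;
}
```

With this sketch a four-character name such as `Doom` on a non-arcade system goes to the text search, while a two-character name such as `Ys` falls back to the ROM name lookup.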

View file

@ -71,6 +71,8 @@ public:
std::string media_screenshot = "ss";
std::string media_video = "video";
bool isArcadeSystem;
// Which Region to use when selecting the artwork.
// Applies to: artwork, name of the game, date of release.
// This is read from es_settings.cfg, setting 'ScraperRegion'.
@ -96,7 +98,7 @@ protected:
std::string region);
bool isGameRequest() { return !mRequestQueue; }
std::queue< std::unique_ptr<ScraperRequest> >* mRequestQueue;
std::queue<std::unique_ptr<ScraperRequest>>* mRequestQueue;
};
#endif // ES_APP_SCRAPERS_SCREEN_SCRAPER_H

View file

@ -218,7 +218,7 @@ void BasicGameListView::remove(FileData *game, bool deleteFile)
// If a game has been deleted, immediately remove the entry from gamelist.xml
// regardless of the value of the setting SaveGamelistsMode.
game->setDeletionFlag();
game->setDeletionFlag(true);
parent->getSystem()->writeMetaData();
// Remove before repopulating (removes from parent), then update the view.

View file

@ -579,7 +579,7 @@ void GridGameListView::remove(FileData *game, bool deleteFile)
// If a game has been deleted, immediately remove the entry from gamelist.xml
// regardless of the value of the setting SaveGamelistsMode.
game->setDeletionFlag();
game->setDeletionFlag(true);
parent->getSystem()->writeMetaData();
// Remove before repopulating (removes from parent), then update the view.