User talk:UDID

From The Urban Dead Wiki

Status

Changelog

  • 2024-11-5: UDID-Spider/0.01
Retrieves, caches, & parses pages; checks pages for UD version; checks stats page for total profile # & compares to stored; profile parsing == work in progress.

Discuss project below

That was fast.

Welcome by Bob Moncrief omitted.

For your keenness, you are hereby drafted to review project UDID. UDID (talk) 04:08, 28 July 2016 (UTC)

Not sure what you need me to review! Your project (while seemingly beyond massive) is pretty straightforward. One question — are you another user taking on a side project? No need to reveal your identity if you don't want :) Bob Moncrief EBDW! 04:10, 28 July 2016 (UTC)
Format, presentation, scope, et cetera. Once the basics are nailed down, populating this with data should be fairly straightforward (think webcrawler/wikibot), and it should be pretty simple to extend the format if desired. Notably, the plan is for people to be able to submit records for a character (e.g. images, *witness records, &c) and have them be publicly visible – think a wiki-style Rogue's Gallery, but open to all manner of records and (hopefully) not subject to the same fate.
As to your question, that would be a reasonable assumption. :) UDID (talk) 04:35, 28 July 2016 (UTC)
Hmm. The formatting etc. seems reasonable. Might be sensible to (at least for now) focus on active characters. But a bot should be able to add the basic profile information without too much difficulty. Bob Moncrief EBDW! 10:56, 28 July 2016 (UTC)
Too hard to know who's active. Could be determined programmatically, but would be several orders of magnitude more difficult than just scraping the entirety of the visible profiles and adding them to the wiki.
Would you mind if we moved this discussion to my talk page? Strikes me having this easily visible would remove the necessity of going over it again. UDID (talk) 11:21, 28 July 2016 (UTC)
Feel free! That was about it for my comments for now. I'll make sure to put further ones on your talk page as they come up. ^_^ Bob Moncrief EBDW! 11:42, 28 July 2016 (UTC)

How?

Hi,

Interesting project to say the least, how do you plan on going about this? PB&J 06:59, 28 July 2016 (UTC)

Thank you! See discussion (above), but basically webcrawler/wikibot, which is why I'm soliciting suggestions for formatting, etc. (Should probably use templates to easily format this all, for starters.) UDID (talk) 11:25, 28 July 2016 (UTC)
Templates would be the easiest yes, although I'm curious to hear how you'll handle the general structure: one giant list is not something most browsers would like, nor would it be very useful. Alphabetical? Numerical based on character ID? Etc. PB&J 11:28, 28 July 2016 (UTC)
I assumed there wouldn't be a master list, just a navigable set of subpages. E.g. if you wanted character #1989226, you'd go to User:UDID/1989226. Bob Moncrief EBDW! 11:44, 28 July 2016 (UTC)
That is indeed the plan, although as character names are unique, a by-name cross-reference is also planned (e.g. User:UDID/N/Bub). The wiki search tool will mean we'll have full-text search available, and that plus substring page title searching should make it the equal or superior of any profile search tool yet.
Getting the exact data/page structure right is going to be an important part of this project. Already wondering what the best way to do this is— sub-subpages, like DangerReport? That would have the advantage of allowing people to reformat the data as they like without being tied to a specific presentation format. Disadvantage: lots of tiny pages, large amount of edits, lots of moving parts. As am already planning to use automation, possibly better to save that for a future version 2, and focus on getting a workable version 1 up and running— that way, if I disappear at least there'll be something for people to go on with. UDID (talk) 12:12, 28 July 2016 (UTC)
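The page-naming scheme described above (one subpage per character ID, plus a by-name cross-reference under /N/) can be sketched as a pair of small helpers. This is only an illustration: the function names and the `root` parameter are hypothetical, and it assumes character IDs are positive integers and that names need no extra cleaning to be valid page titles.

```python
def id_page_title(char_id: int, root: str = "User:UDID") -> str:
    """Title of the per-ID page, e.g. User:UDID/1989226."""
    if char_id <= 0:
        # Assumption: profile IDs are positive integers.
        raise ValueError("character ID must be a positive integer")
    return f"{root}/{char_id}"


def name_page_title(name: str, root: str = "User:UDID") -> str:
    """Title of the by-name cross-reference page, e.g. User:UDID/N/Bub."""
    name = name.strip()
    if not name:
        raise ValueError("character name must not be empty")
    # Assumption: the name is already a legal wiki page title.
    return f"{root}/N/{name}"


if __name__ == "__main__":
    print(id_page_title(1989226))
    print(name_page_title("Bub"))
```

Keeping the title logic in one place would let the bot and any later cross-referencing pass agree on where a given character lives.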
I'd definitely go the DangerReport route right from the start. It lets you create a page that contains JUST the facts about the person's profile (must-haves: last updated, ID, name, join date, and class; nice-to-haves: XP, group, real name + link, status, number of deaths, description, skills (boolean value for each)) in a highly-structured format. We can set up a basic template for how those can be displayed in a reasonable way right from the start, while also leaving the option open to others to display them in different ways.
Each of those pages (which would likely use the player's UDID for the page name) would then be included in a second page (i.e. like the location pages) that displays the template and has a space for unstructured content. This second page would likely be named after the player's character, and it'd be where anything not directly related to the profile would go, such as screenshots, dumbwits, comments, etc. That way we'd have a clean separation between NPOV facts and everything that may not be, which should help make it easier to deal with people who try to vandalize the various pages.
Also, despite it being in your userspace, I expect that this will be treated as a space where people can freely update the data with more recent information, just like the DangerReports? Except that unlike the DRs, which are treated as NPOV, they'll also be able to add some POV stuff like using it as a Rogue's Gallery? If so, I'd suggest putting the rules for how your userspace is going to work in writing on your userpage, that way we (sysops) have a way to deal with questions when they arise and you're not around to say whether someone's edit conforms to what you allow. Because I suspect that something like this is sure to be controversial at some point. Aichon 15:13, 28 July 2016 (UTC)
I really should have looked at your userpage and example first, since I see now that you've already done some of the things I suggested. >_< Aichon 15:20, 28 July 2016 (UTC)
Ha! No problem; if nothing else, it's nice to know someone else is thinking along the same lines and reaching conclusions that are not too dissimilar. :) All right, you've convinced me – agree about taking the DangerReport route, if only because trying to figure out how to do what I wanted without it was giving me a headache. Rules/methods for annotations are still pending, but I'm thinking clever templating is probably the way to go with that.
(It has just occurred to me that this project could almost be considered an attempt to re-implement the "new" Rogues Gallery in wiki. Amusing.) UDID (talk) 10:45, 29 July 2016 (UTC)

How fast do you think you can get it all done, if you get the format right? Bob and I discussed this briefly over IRC, since there are about 2.5m unique characters ingame. Letting the wiki-bot add them all in one go will probably crash the wiki; if a lot of template calls are included, it will without a doubt crash the wiki. Spreading it out over x number of sessions seems like the way to go, but I have no idea what the wiki can handle in volume and traffic. PB&J 14:38, 28 July 2016 (UTC)

Good question. Without a doubt the fastest method would be simply to dump the (HTML) contents of the ~2.5M character profiles and then upload that directly to the wiki. Of course, that would be of severely limited use, so I'm thinking I'll preprocess the profiles and add only the relevant information. In the current format, it would be roughly the same amount of edits; much less data, but probably the same impact on the server. If we go with templates à la DangerReport, I wouldn't really care to estimate based on what little is known about server performance — which is why I intend to utilise an API-compliant wikibot, test it before putting it to use on more than one page at a time, and make short test runs while monitoring server load so I know what sort of edit rate is sustainable. Arranging to have an admin on-hand to block the bot if necessary could also be a worthwhile precaution.
Bear in mind we have only a concept at this point. Might be able to get started on (non-wiki) coding this weekend, might not. It would help if we could figure out at least an approximation of a data/template structure with which to make this data as usefully accessible as possible. :) UDID (talk) 10:45, 29 July 2016 (UTC)
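To make the "short test runs at a sustainable edit rate" idea concrete, here is a minimal pacing sketch. Everything in it is an assumption for illustration: `save_page` stands in for whatever call the chosen wikibot framework provides, and the rate of 6 edits per minute is an arbitrary example, not a measured limit for this wiki. At that illustrative rate, ~2.5M pages would take about 2,500,000 / 6 / 60 / 24 ≈ 289 days, which is why the sustainable rate matters so much.

```python
import time
from typing import Callable, Iterable, Tuple


def days_to_finish(total_pages: int, edits_per_minute: float) -> float:
    """Rough wall-clock days needed at a constant edit rate."""
    return total_pages / edits_per_minute / 60 / 24


def upload_in_batches(
    pages: Iterable[Tuple[str, str]],
    save_page: Callable[[str, str], None],  # hypothetical: supplied by the bot framework
    edits_per_minute: float = 6.0,          # illustrative value only
    batch_size: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Save at most batch_size pages, pausing between edits. Returns the count saved."""
    delay = 60.0 / edits_per_minute
    done = 0
    for title, text in pages:
        if done >= batch_size:
            break  # stop here so server load can be checked before the next run
        save_page(title, text)
        done += 1
        sleep(delay)
    return done
```

Capping each run at a fixed batch size leaves a natural point to check server load, or for an admin to step in, before continuing.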
So if I'm correct, you're going to scrape the UD profiles, have them processed locally and then upload via wiki-bot? The simplest way I can think of to structure them would be to use a simple profile template (maybe [[ ]] brackets around their group affiliation?) and have the scraped data sorted by category. Category:UDID/A, Category:UDID/B,..., Category:UDID/Scientists, etc.
At work currently, so in a rush, but I seriously hope I'm making sense. PB&J 11:17, 29 July 2016 (UTC)
Making sense? Sure! :)
I'm thinking splitting things into one-item-per-subpage and then aggregating via template makes entirely too much sense, especially since this wiki appears to lack advanced string-handling capabilities. Currently working on figuring out a basic page/data structure that is consistent and at least somewhat optimised.
Categorisation is definitely in the plans but seems like something a later pass can handle easily – no need to deal with that on first-pass. :) UDID (talk) 11:40, 29 July 2016 (UTC)
I'll try to whip a barebones example page out today that should demonstrate a simple structure for the data and a simple template to display it. Aichon 14:57, 29 July 2016 (UTC)

mein gott

Would we have to protect all these?--Rosslessness ; the shambling custodian of UD's past... 19:43, 28 July 2016 (UTC)

Surely not. I'd envision them ideally being initially populated via bot, then updated afterwards by any user interested in keeping them up to date. They'd be a prime target for vandalism, I'd imagine, but they wouldn't need protection. Aichon 21:27, 28 July 2016 (UTC)
Why bother? An attractive nuisance has all sorts of uses. ;) UDID (talk) 11:25, 29 July 2016 (UTC)

Data format and example uses

Okay, I've tossed together some examples, to highlight a variety of things that can quickly be made if this data is provided in a standardized format. The example data itself is for my survivor character and is located at: User:Aichon/Sandbox/Demo15. You'll need to click Edit to see the data, since it defaults to displaying one of the templates, but all of the templates in the example page are built on top of the data, which consists solely of the information that is currently publicly accessible for that character (hence why there are gaps in the data, since some of that data is only available if my character is dead).

As I built out the data, I found that there were at least a few additional variables that hadn't been mentioned yet (e.g. separate descriptions for if you're alive or dead), so I added the ones I noticed and tried to structure things in such a way that the most frequently changed values (e.g. status) are at the top, one-time changed values (e.g. skills) are in the middle, and never-changed values (e.g. id, name) are at the bottom, that way it's easy to keep things updated without having to read through the whole file.

A few things:

  • I'm not super happy about the variable names I chose for the skills. On the one hand, I want them understandable for wikinewbs. On the other hand, if the format is used millions of times, then every character counts, so going as short as possible is better. I tried to strike a balance between the original, full skill names and keeping them short, but I'm open to suggestions.
  • We'll need to wrap any user-editable fields (e.g. group name, description, etc.) in <nowiki> tags or else scrub the data first to make sure that wikicode in those fields doesn't break the wiki. The real name link in particular will likely need to be scrubbed, since I don't think it will be usable if we wrap it in nowiki tags.
  • I separated the real name and its link into two, separate variables, that way templates that build on the data can choose whether to use one or both of them.

Anyway, don't worry about any of the display templates too much, nor the details about how the data is getting displayed, since those are all just examples and details that can be easily changed later, rather than final products. The only thing that we need to lock in now is the data format itself, that is, we need to lock in which variables to include and how they should be named. Everything else is secondary at the moment. Aichon 17:39, 29 July 2016 (UTC)

I've added a few more example templates and also have created data for my zombie as well, that way we have some more examples of the data at work. Aichon 18:45, 29 July 2016 (UTC)

Oh, very nice. Wouldn't even have thought of having a template switch work like that — that's much better than a subpage for every item. Definitely glad I brought this here first rather than just getting straight to it. A few thoughts:

  • Separate dead/alive descriptions— definitely intended.
  • If there is a conflict, probably better to err on the side of usability and maintainability rather than over-cleverness.
  • Number of edits is going to have a much bigger impact on the wiki than a few bytes here and there in variable names. I was thinking of putting skills all in one variable and parsing them out (in the interest of not having an additional ~45 subpages), but your solution makes much more sense and is going to be much easier for people to use.
    • On that note, |died= could probably stand to be |diedcount=, just to remove ambiguity as to what it is and why it might be empty. (I confess, this is from personal experience just now. Better to make it absolutely clear now and head off any more confusion later.)
    • A minor usability tweak in formatting I would make, which is standard at Wikipedia (among others) in long templates, is for the formatting of each line intended to be editable to be like so: |variable= value. That extra space might seem minor, but helps prevent a lot of munged lines, and makes the filled template much easier to read as it separates structure from data, even to the untrained eye. :)
  • "Real name"/link were always intended to be stored separately - apart from avoiding certain problems, that means the data can be used separately if desired.
  • Possibly the best compromise for user-editable fields is to "break out" that section to be on its own line in the wikicode, wrap it in <nowiki> tags, and have instructions in HTML comments that let people know to only edit between the lines. It's not ideal, but it beats any other solution I can think of offhand.
  • Those example templates are fantastic and inspiring. :D
  • This templating method largely solves my other concern, which is how to make all this data easily bot-accessible.
  • It would probably be fairly easy to mock up an imitation of the default UD profile page, or a more concise "dossier" format, but I am happy to leave that to those with more talents in those areas.
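As a rough illustration of the `|variable= value` layout suggested above, a bot could emit each data page along these lines. The template name `User:UDID/Profile`, the function name, and the sample values are placeholders; only `status`, `diedcount`, `id` and `name` are variable names taken from this discussion, and the final variable list is still to be agreed.

```python
def render_profile_data(fields: dict, template: str = "User:UDID/Profile") -> str:
    """Render a template call with one '|variable= value' line per field.

    Fields are written in the order given, so the caller can put
    frequently changed values first and never-changed values last.
    """
    lines = ["{{" + template]
    for key, value in fields.items():
        # rstrip() drops the trailing space left behind by an empty value
        lines.append(f"|{key}= {value}".rstrip())
    lines.append("}}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample values are made up for the example.
    print(render_profile_data({
        "status": "alive",
        "diedcount": "",
        "id": 1989226,
        "name": "Bub",
    }))
```

Empty values are kept as bare `|variable=` lines so that a later editor can see which fields exist even when the data is not yet known.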

Thank you very much. Now I can throw away the half-arsed post I was working on in another tab and it will never have to be seen by any eyes bar my own. :D UDID (talk) 23:27, 29 July 2016 (UTC)

NB I do intend to run the proposed scope of the project past Kevan before seriously getting underway. While as far as I am aware he has had no problem with various persons scraping and indexing profile data before, this would potentially have rather a large impact on the wiki and it seems both courteous and pragmatic to have him aware and onside if possible. UDID (talk) 23:44, 29 July 2016 (UTC)
Regarding user-editable sections, my concern was actually that a player's in-game profile might contain wikicode or other special characters in those user-editable fields, particularly if this project takes off. The bot would then dutifully fill it into the appropriate variable and things would start breaking. I agree that we should make the sections readable and use HTML comments appropriately, but wikinewb editing is a separate matter from what I was talking about. ;) Aichon 00:36, 30 July 2016 (UTC)
Oh aye, I got that, but it's much easier to train a bot to correctly use escaping than it is a human. :P This should be part of what gets ironed out in the test phase, but making input wiki-safe was more or less a solved problem last I looked at wikibots, so this shouldn't (fingers crossed) be an issue. UDID (talk) 11:05, 30 July 2016 (UTC)
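One way to make scraped, player-written text wiki-safe is sketched below. Note that this is a different technique from the `<nowiki>` wrapping discussed above: it replaces characters that commonly trigger wiki markup with numeric HTML entities, so a stray `</nowiki>` in a profile cannot break out of the wrapper. The character set is an assumption and is not exhaustive (for example, it does not handle list markers at the start of a line), and the function name is hypothetical.

```python
# Characters assumed to be worth neutralising: ampersand, angle brackets,
# link and template brackets, pipe, equals sign, apostrophe (bold/italic)
# and tilde (signatures).
_WIKI_SPECIALS = {ord(c): f"&#{ord(c)};" for c in "&<>[]{}|='~"}


def wiki_escape(text: str) -> str:
    """Replace markup-triggering characters with numeric HTML entities.

    str.translate handles each character once, so the '&' inside a
    freshly produced entity is not escaped a second time.
    """
    return text.translate(_WIKI_SPECIALS)


if __name__ == "__main__":
    print(wiki_escape("{{delete}} [[Some Group]]"))
```

The real name link would still need separate handling, since an escaped URL is safe to display but no longer works as a link.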

Firstly, I love this. Secondly, if you formatted the profiles using the ud style template so they looked like profile pages would it wreck the wiki? Thirdly, not to throw a spanner in the works, but some profiles have some extra information you might want to include. --Rosslessness ; the shambling custodian of UD's past... 21:13, 30 July 2016 (UTC)

To answer your points in order:
  1. Thank you.
  2. Yes: immediately, irrevocably, and irretrievably. :P (I can't actually find any such template, but such a thing is certainly very much doable and on the drawing board for once the dataset is sorted out.)
  3. Good catch! Definitely worth adding the Monroeville status badges (up to two, as far as I can tell), and was there an equivalent for Borehamwood? (Doesn't look like it.)
This is definitely supporting the "FIRST dump entire pages, THEN process for wiki" workflow. Thank you for the contribution! UDID (talk) 04:08, 31 July 2016 (UTC)

Location

Question (open to anyone)
Do you think it best to keep the project itself in userspace à la User:DangerReport, or to move it into mainspace, possibly under/as an expansion of UDID (which links to http://profiles.urbandead.net (which is "Account Suspended"))?

--UDID (talk) 23:54, 29 July 2016 (UTC)

I'd keep it here, particularly if you intend to tack on Rogue's Gallery-type features or other POV comments and the like. Those sorts of things wouldn't work well in the mainspace. Aichon 00:22, 30 July 2016 (UTC)
Userspace, definitely. Trying to keep this in mainspace would a) open us to a rogues- or zerg-gallery-like thing being (partly) sysop-controlled, which the sysop team has consistently opposed; and b) be an NPOV nightmare. Bob Moncrief EBDW! 00:44, 30 July 2016 (UTC)
Good good, just making sure I wasn't missing any pressing reason to mainspace it or making some horrible mistake [by userspacing]. Thank you both! UDID (talk) 11:08, 30 July 2016 (UTC)
The UDID userspace seems like an excellent place for it, if not for the above reasons, then to avoid confusion. PB&J 23:52, 30 July 2016 (UTC)
Yes, I suppose having user information in userspace does have a certain logic to it. :) UDID (talk) 04:00, 31 July 2016 (UTC)
Not exactly what I meant, it's just that the "User:UDID" prefix in the page titles will become its own clear pagetype, in the same way the Danger Report ones did. As soon as you're on one of those pages, users will know what they can/can't/should/shouldn't do. :) PB&J 07:45, 1 August 2016 (UTC)

Update

Confirming this is still happening. (See #Changelog.) UDID (talk) 07:18, 19 August 2016 (UTC)