Some of the problems I know about:
- it removes semicolons after a closing parenthesis.
- it removes backslashes, one at a time. That is, if you have a double backslash, one of them stays until you edit the post or it is quoted.
- it completely ignores the number of spaces in code tags.
- a quotation mark followed by a closing parenthesis usually produces a smiley. More specifically, it's the smiley that comes from semicolon + closing parenthesis. That's probably because the quotation mark is converted into a string of characters that ends with a semicolon (presumably an HTML entity such as &quot;), and then the parser ******** up.
- there's some more funny behavior with certain characters, but I can't remember any specifics atm.
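To illustrate the quotation-mark guess from the list above, here is a minimal Python sketch. It is not the forum's actual code: the function names, the one-entry smiley table, and the "[wink]" placeholder are all made up for the example. It only shows how escaping first and scanning for smileys second could turn a quote plus parenthesis into a smiley, and how scanning the raw text avoids that.

```python
import html

# Hypothetical smiley table; the forum's real codes and image markup are unknown.
SMILEYS = {";)": "[wink]"}

def escape_then_smileys(text: str) -> str:
    """Suspected buggy order: entity-escape first, then scan for smileys."""
    escaped = html.escape(text, quote=True)  # '"' becomes '&quot;'
    for code, image in SMILEYS.items():
        escaped = escaped.replace(code, image)  # now matches the ';' of '&quot;'
    return escaped

def smileys_on_raw_text(text: str) -> str:
    """Find the smiley in the raw text, then escape only the pieces between."""
    pieces = text.split(";)")
    return "[wink]".join(html.escape(piece, quote=True) for piece in pieces)

post = 'He said "hi")'
broken = escape_then_smileys(post)   # 'He said &quot;hi&quot[wink]'
intact = smileys_on_raw_text(post)   # 'He said &quot;hi&quot;)'
```

The second function only handles a single smiley code, which is enough to show the ordering problem.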
I need to know what is wrong with the parser. Lanzer wants to know, and I really don't know how to explain it that well. Can someone tell me so I can pass it on to Lanzer in this Ask the Admin?
Before people say this can't go here: my question is what is wrong with the parser, so I can explain it to the admin for us. After that, it is the last of the matters I wanted to bring up to the admin.
You could just refer to the post I made in Spring Cleaning two goddamn years ago. It lists the majority of the known issues and makes some suggestions.
None of the issues I listed then have been fixed. There has been at least one issue introduced recently, but I haven't spent the time to identify it.
Everything on the list stems from three things: the BBCode parser scans with poorly written regexes instead of actually parsing, the smileys parser runs as a separate pass with no regard for the surrounding content, and the "anti-XSS" filter does not do what it's supposed to do at all.
In fact, the anti-XSS crap is completely wrong:
- It does an unnecessary HTMLEntities pass that mangles actual HTML entities (for example, you can't use the less-than or greater-than symbol in the vicinity of text without it being mangled into an ampersand by overzealous HTML removal).
- It treats % followed by any two alphanumeric characters as a URL-encoded sequence and then decodes it wrong.
- Backslashes, semicolons, and a few other things disappear when posts go into the server's page cache (so four backslashes look like four until someone else views the page, then they become two, until someone quotes the post, where they become one, which then disappears as it enters the cache again).
... And a dozen other problems I can't be arsed to list here.
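The four, two, one, zero backslash pattern is what you would expect if some "remove one level of escaping" step runs every time the post passes through the cache or a quote. Here is a small Python sketch of that guess. The function name is hypothetical, and whether the forum really does this is an assumption, not something confirmed here.

```python
import re

def strip_one_escape_level(text: str) -> str:
    """Drop each backslash and keep the character it escapes, if there is one.

    This is a guess at the mechanism, not the forum's actual code.
    """
    return re.sub(r"\\(.?)", lambda m: m.group(1), text, flags=re.DOTALL)

post = "\\" * 4                      # four literal backslashes
history = [post]
while history[-1]:                   # keep unescaping until nothing is left
    history.append(strip_one_escape_level(history[-1]))

lengths = [len(s) for s in history]  # [4, 2, 1, 0]
```

If that is what happens, the fix is to escape once on output and never unescape stored text on the way through.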
It's not simply broken. It needs to be completely rewritten by someone competent enough to write an actual honest-to-god lexer and parser. It's not that hard, and it can be written to run very, very fast.
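For a rough idea of what "an actual lexer and parser" could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the tag set ([b], [i], [code]), the "[wink]" placeholder, and the function names are hypothetical, and it is nowhere near a full BBCode implementation. The point is the shape: one left-to-right scan into tokens, code blocks kept verbatim, and text escaped exactly once with smileys matched against the raw input.

```python
import html
import re

TAGS = {"b": "b", "i": "i"}   # hypothetical BBCode-to-HTML tag map
SMILEYS = {";)": "[wink]"}    # placeholder output, not the forum's markup
TAG = re.compile(r"\[(/?)(b|i|code)\]", re.IGNORECASE)
CODE_END = re.compile(r"\[/code\]", re.IGNORECASE)
SMILEY = re.compile("|".join(re.escape(code) for code in SMILEYS))

def lex(source):
    """Yield (kind, value, raw) tokens in a single left-to-right scan."""
    pos = 0
    while pos < len(source):
        m = TAG.search(source, pos)
        if m is None:
            yield ("text", source[pos:], source[pos:])
            return
        if m.start() > pos:
            yield ("text", source[pos:m.start()], source[pos:m.start()])
        closing, name = bool(m.group(1)), m.group(2).lower()
        pos = m.end()
        if name == "code" and not closing:
            end = CODE_END.search(source, pos)
            if end is None:                       # unterminated: keep it literal
                yield ("text", m.group(0), m.group(0))
            else:                                 # body is taken verbatim
                yield ("code", source[pos:end.start()], m.group(0))
                pos = end.end()
            continue
        yield ("close" if closing else "open", name, m.group(0))

def render_text(raw):
    """Escape raw text exactly once; smileys are matched on the raw input."""
    out, pos = [], 0
    for m in SMILEY.finditer(raw):
        out.append(html.escape(raw[pos:m.start()]))
        out.append(SMILEYS[m.group(0)])
        pos = m.end()
    out.append(html.escape(raw[pos:]))
    return "".join(out)

def render(source):
    """Turn the token stream into HTML, tracking open tags on a stack."""
    out, stack = [], []
    for kind, value, raw in lex(source):
        if kind == "text":
            out.append(render_text(value))
        elif kind == "code":
            out.append("<pre>" + html.escape(value) + "</pre>")
        elif kind == "open":
            stack.append(value)
            out.append(f"<{TAGS[value]}>")
        elif stack and stack[-1] == value:
            stack.pop()
            out.append(f"</{TAGS[value]}>")
        else:
            out.append(html.escape(raw))          # stray close tag stays literal
    while stack:                                  # close anything left open
        out.append(f"</{TAGS[stack.pop()]}>")
    return "".join(out)
```

With this shape, a post like `[b]say "hi")[/b] [code]a    b \\ ;)[/code]` keeps its quotation marks, spaces, backslashes, and semicolons, because nothing is rescanned after it has been escaped.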
Well, that is a great way to put it. I just wanted to ask, in case the question ever comes up: do I have the OK from the author to copy any of this? I was told once to get the OK from the author before using any part of their topics. Lanzer was unable to help me last night, but I am going to go for round two next week with the new information I was given in this topic. If you have anything more that you feel fits with this, please post it here and I will pass it on. I am going to make a question out of the facts in this topic in the next day or two.
I just want to let everyone here know that I have a topic in CB about what the admins are telling me regarding my questions/matters for the C@T. I will post what I get back from them next week when I ask my new question about this.