It complicated the way I wrote things. I might have to redesign the code. Maybe I coded it poorly and can come up with a better way.
Sometimes I pass the strings back and forth between functions. I'll need some other test in my validators to see if it's a pass between functions or the first time it's being used....
I've looked and looked in other code, and very few people seem to be taking all these precautions. I'm sure having trouble trying to guess at how to create these validators, what with not knowing how attacks happen.
You guys assume I just know, but honestly I don't know how they exploit these functions. If I could find examples I could test what I'm doing.. I hate guessing... lol
You've been very helpful Raven. The problem lies with me. I simply don't know what I'm doing.
What might be common knowledge to another person, I may not have stumbled across yet. *shrugs*
The extent of my knowledge lies within 3 PHP books. Each devotes one chapter to MySQL queries. None of them talk about security issues. They seem to just assume that you're on a secure site or something. I don't get it intuitively. I need examples. lol Then I can write a function and start testing, I test a lot........ lol
Without clear examples of what I need to trap out of my strings I'm kinda lost. If I had to, I could sit down and write a C-like function that stepped through my string one character at a time and took out everything that had to be out. My problem is not knowing all the ways I could get in trouble, what to leave and what to take out.
Most of my strings are simple. Straight text. No html, no php... I'm not having any trouble with these strings.
However, I need one of my strings to allow the html in the admin's allowable list. This is the one that's causing me grief..
Thanks for the link, I'll read it now.
I've already read all the stuff in the online php manual about addslashes, and stripslashes...
I get variables from three sources like most everyone else I assume.
a database, user input, and internal code manipulation.
I actually have some code working now, but without any validation. Last week I began trying to secure it. I was trying to use the internal functions in main.php, but I can't figure them out.
This partially worked for strings that were not supposed to have html.
Code:
$title=FixQuotes(filter_text($title, "nohtml"));
But it's not 100% right and I could not figure out why after about 8 hours of searching. For one thing, FixQuotes does not do the get_magic_quotes_gpc() test. It just starts doing the ereg. I had concerns about that. Furthermore, the filter_text does NOT strip out words that are in the disallow list. I was never able to figure that out either. The no html seemed to work, but I never finished testing what would happen with php commands.... all these issues from the easy one.....
I have one user input string that needs to allow html. It's a large body of text or html, their choice. I may have to add a check box for them to tell which it is (text, html) but I had hoped I could figure it out on my own...
This has all the problems already mentioned, plus one more. The backslash seems to be disallowed in one of those long complex ereg_replace statements in main.php. It's stripping everything past a backslash.. Even if it's unobtrusive... like in lists where it's not a problem. this / that / the other.
Everything after the first / is stripped out of the string. ?!?!?
I'm in sad shape if I can't even figure out how the internal functions work.
That's when I decided to try and write my own, at least the one about testing if magic quotes are on or not. I'd still like to use the admin's allowed list of html, and disallowed list of words... It seems to me I should use his lists..
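For what it's worth, a magic-quotes test can be quite small. This is only a sketch under assumptions: the name strip_if_magic_quotes() is made up for illustration and is not one of the main.php functions, and the function_exists() guard is there because get_magic_quotes_gpc() no longer exists in PHP 8.

```php
<?php
// Hypothetical helper (not from main.php): undo slashes only when
// magic quotes actually added them.
function strip_if_magic_quotes($value)
{
    // Guard the call: get_magic_quotes_gpc() was removed in PHP 8.
    // The @ hides the deprecation notice that PHP 7.4 emits.
    $magic = function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc();

    // With magic quotes off, return the string untouched so that
    // legitimate backslashes survive.
    return $magic ? stripslashes($value) : $value;
}
```

The idea is that stripslashes() runs once, at the point where request data first comes in, and not again when the string is passed between functions.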
I have read the php manual about htmlentities(), but I haven't messed with it. I've been determined to get the nuke routines to work. Seems like they should.
But when I do the same exact call that, say, the news module or the encyclopedia module does, I don't get the same results. I can't see what's different..
My validators are accepting the allowable html. They are substituting the disallowed words with the specified substitution character. Single quotes and double quotes seem fine. Forward slashes seem alright.
The last thing I'm having trouble with is the backslash.
It's unconditionally stripped off no matter what. Even when it's alright to leave it there.
Here is right after I read a variable from the database.
It's stripping slashes that are in the middle of nowhere. Like say this \ and \.
filter_text calls many internal functions. check_html() is one of them, and it strips the disallowed html tags. Right off the bat they call stripslashes(). Why are they doing this there? It doesn't seem like he should just automatically strip slashes. I still don't understand.
There has got to be a way to make the internal functions work.
They had to write internal functions to reverse odd effects of native functions, very much as you're describing. It might give you a steer in the right direction to see how they handled it.
I've been playing with it a little tonight and like the way it works. I think all I've left to get into the safe array is text color.
Now I just have to hack htmlarea so it uses <br> instead of <p> tags or something. The <a href tags are stripped; from what little documentation there is, it looks like this is why. It changes them into <A></A> for me now.
Otherwise it doesn't miss a beat, and it cleans evil just like it claims lol. I feel about as safe with this allowed array as with bbcode. And it's as easy or easier to add new tags and attributes, aside from the issue above.
So his code is GNU General Public License (GPL). Great, ah.. But what exactly does that mean?
I read the license, and don't think that was an easy task! lol
It talks a lot about what you can and can't do to redistribute his code, but I didn't see anything specifically about using it as an internal part of code that you intend to redistribute.
What exactly must we do to use his code in something we're going to redistribute? Is that considered the same as redistributing his code? If I'm reading it right then we are required to keep his documentation and license inside our distribution as well.
I was wondering if you knew what this particular element of the array means.
Code:
'dummy' => array('valueless' => 'y')),
I read the docs and I still don't understand.
Quote:
'valueless' checks if an attribute has a value (like <a href="blah">) or not
(<option selected>). If the given value is a "y" or a "Y", the attribute must
not have a value to be accepted. If the given value is an "n" or an "N", the
attribute must have a value. Note that <a href=""> is considered to have a
value, so there's a difference between valueless attributes and attribute
values with the length zero.
What the heck's dummy?? There's no attribute like that...
Then I noticed with the textFilter() I had to use FixQuotes or things went haywire. When using
$subject = textFilter(kses(ADVT_stripslashes($subject), $allowed) );
Code:
/* Added for quotes with textFilter (KSES) */
function FixQuotes ($what = "")
{
    // Double every single quote.
    $what = ereg_replace("'","''",$what);
    // Turn any backslash-escaped quote (\') back into a plain quote.
    while (eregi("\\\\'", $what)) {
        $what = ereg_replace("\\\\'","'",$what);
    }
    return $what;
}

function textFilter( $str )
{
    $str = FixQuotes( $str );
    // Load the bad-word list; stop if the query fails.
    $result = mysql_query( "SELECT word FROM badwords" ) or die("Badwords Broke");
    // Replace each bad word, ignoring case.
    while( $row = mysql_fetch_array( $result ) )
        $str = eregi_replace( $row['word'], "\*", $str );
    return $str;
}
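One thing to be aware of with textFilter() is that eregi_replace() treats each bad word as a regular expression, so a word containing characters like . or ( can match more than intended. A minimal sketch of a literal, case-insensitive replacement, assuming the word list is passed in as an array instead of being read from the badwords table (mask_bad_words() and its parameters are hypothetical names):

```php
<?php
// Hypothetical helper: replace each listed word with a mask character.
// $words is assumed to be a plain array of strings.
function mask_bad_words($str, array $words, $mask = '*')
{
    // str_ireplace() matches each word literally and ignores case,
    // so no regex escaping is needed.
    return str_ireplace($words, $mask, $str);
}
```

Called as mask_bad_words($subject, array('darn')), this masks "Darn" and "DARN" as well. It also masks the word inside longer words, which may or may not be what you want.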
None of which really answers your question, sorry. I really couldn't make sense of what the author was trying to explain about the valueless Y y.
Thanks for taking the time to post what you did post. I appreciate the opportunity to see your allowed array.
You mentioned it getting confused with quotes. I've only made a few tests, but in the tests I made it handled them right. In what example input string were you able to confuse it with quotes?
As far as the
Code:
'dummy' => array('valueless' => 'y')),
goes. My best guess is that if hypothetical attribute dummy has a value then it will be stripped from the input string. If it does not have a value it will be allowed in the string.
On the other hand if the flag had been no, I'm assuming it must have a value or it will be stripped.
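Going only by the quoted documentation, the rule reads like the sketch below. This is not the library's own code: passes_valueless_check() is a hypothetical name, and letting an unrecognized flag through is an assumption made just for the sketch.

```php
<?php
// Hypothetical illustration of the 'valueless' rule as the quoted docs
// describe it. $hasValue is true for <a href="blah"> and for <a href="">,
// and false for a bare attribute such as <option selected>.
function passes_valueless_check($hasValue, $flag)
{
    $flag = strtolower($flag);
    if ($flag === 'y') {
        return !$hasValue;  // attribute must NOT have a value
    }
    if ($flag === 'n') {
        return $hasValue;   // attribute MUST have a value
    }
    return true;            // assumption: unknown flags are not enforced
}
```

Read that way, 'dummy' => array('valueless' => 'y') would accept a bare dummy attribute and reject dummy="anything".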
When the value 'word' was passed from the database with magic quotes on, it wasn't being stripped before, because 'word' is replacing the original string. I'll have to test that a little more; maybe that's not the cleanest way to handle it.
I couldn't get this one to work as is. I believe there is a syntax error in the min and max values, in the way you're applying them. I'm sorry to say I don't understand it this way either.
From my humble understanding this is a two dimensional array, and you're adding the min and max value out of context here. What element of the first array would your min and max value apply to?
My last question for the moment is about the link attribute you're allowing in the a tag. I've never seen link inside an a tag. Is that something that people do? I kinda work alone. My experience is very limited to whatever I'm trying to do at the moment. I only ever used LINK to add style sheets and stuff.
I'm not nit-picking, honest. I'm just self taught and I want to get this right. It's really important to me. I'd like to understand your reasoning for things.
What I don't like is that if max length is exceeded it strips the whole title not just trimming off the length to 100.
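If trimming is what's wanted, one option might be to shorten the value yourself before it ever reaches the filter. A rough sketch, assuming a plain-text title made of single-byte characters (trim_to_maxlen() is a hypothetical helper, not part of the filter library):

```php
<?php
// Hypothetical helper: cut a string down to $maxlen bytes instead of
// rejecting it outright. Assumes single-byte text.
function trim_to_maxlen($value, $maxlen)
{
    return strlen($value) > $maxlen ? substr($value, 0, $maxlen) : $value;
}
```

Something like trim_to_maxlen($title, 100) would keep the first 100 characters. It only makes sense for free text, though; a trimmed URL would simply be a broken URL.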
Edit: And to wrap it all up, I do think align is a proper attribute of <pre; at least I know it will work. Is there someplace that says it's an illegal attribute?
Edit:
As long as we're at it I cleaned up font size too. If you see anything else feel free to critique away! I went through that faster than I should have, and really you've helped me more than the other way around.
Probably the table attributes need corrections too, but I am going out of town for a couple of days so I won't be back for a bit.
So now I have this:
With respect to your first example of an a tag in the allowable array, if I helped in any way I'm very glad!
As far as the align attribute within a pre tag goes, that's news to me. I've never seen it before. I actually tried align="right" and nothing happened with my particular browser. We should find that URL on the web and run it through and see if it's W3C compliant or whatever they call it. I don't mind doing it if you have the URL to test it?
Setting a maxlen too small can be a problem because, as you've pointed out, it completely strips the tag if it's exceeded. In the case of an image tag's src attribute this would be bad.
On the other hand, it offers a lot of protection from attacks where they are trying to overflow the buffer.
I'm happy to keep maxlen set moderately high, document it, and live with it.
I'm going to go thru every tag and attribute and manually test all possibilities I can think of.
I'll post my array when I'm more comfortable with it.
Before I run off to tinker with this though, I do see one problem with your font size.
You have it set appropriately for min and max integer values of 1 to 7, but it's also valid to say +1 or -1 etc..
Perhaps we'll have to settle for a minlen of 1 and maxlen of 2 for that attribute.
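A length check alone would also let through something like 99 or ab. If a pattern test can be applied to that attribute, something along these lines might be tighter. It is only a sketch: font_size_ok() is a hypothetical helper, not an option of the filter, and it assumes relative sizes are limited to +1 through +7 and -1 through -7.

```php
<?php
// Hypothetical validator for <font size="...">.
// Accepts 1-7, or a signed relative size such as +1 or -2.
function font_size_ok($size)
{
    // Optional sign, then exactly one digit from 1 to 7.
    // \z anchors at the very end so a trailing newline is rejected.
    return preg_match('/^[+-]?[1-7]\z/', $size) === 1;
}
```

With this, font_size_ok('+1') and font_size_ok('7') pass, while '8', '99' and '+0' are rejected.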
In my usage (htmlArea), size values are strictly limited to 1-7, where 1 = 8px and 7 = 36px. I get a +0 when trying to ad-lib with 8 or higher. That beats stripping the tag completely like maxlen did. At least that was my goal, to get the two to jibe. My maxlen seems short, but the total allowed chars for each post is only 1000, I think. I'll have to double check that. That was why I kept it trimmed down. I might reconsider and increase the overall allowed post size; it's a text column, so a much greater value would be ok.
<pre align= documentation if that helps at all?
Shows align attribute
Shows width attribute
I wouldn't call this proof positive it's proper, but at least it shows someone else thinks it is.
I have a problem with backslashes or sometimes repeated quotes being added to my text in French where there is a quote. A while back, someone suggested a fix that can be added inside a module to remove the effect of backslashes or additional characters being added in the text. For example, typing C'est will yield C"est or sometimes C\'est. Really weird stuff.
Does anyone know how to add extra information inside index.php or the module so that it will not "translate" the characters?