I’ve been using the Yahoo User Interface Library (YUI) in my web app, and one particularly cool component is the YUI Rich Text Editor: it’s cross-browser compatible, fully extensible, and best of all it’s free :)
For all its greatness, one thing I’ve struggled with is that if you copy & paste stuff from another app into the YUI Editor, all of the original formatting is maintained. Most of the time for us, the “other app” is Microsoft Word, which does a particularly heinous job of generating HTML from a formatted document. This almost always wreaks havoc if the user subsequently tries to change text styles in the editor, as the underlying HTML is a total mess.
So, the solution for us was to try and strip out all of the formatting when somebody pastes stuff into the Editor, resulting in nice clean HTML that plays well with the YUI Editor formatting functions. Now, unfortunately no such feature exists in the YUI Editor, and Googling around just led to dead ends, so I was left to build my own.
CleanPaste for YUI Rich Text Editor
For information on what’s supported, please read the notes on CodePlex, as I will keep this updated as I make bug fixes etc.
To use the CleanPaste script, follow these steps:
- Ensure you’ve already installed the YUI components & created a Yahoo Editor on your page.
- Place the CleanPaste.js file somewhere in your project.
- Include the script in your page using the following code:
<script type="text/javascript" src="CleanPaste.js"></script>
Ensure the src attribute points to the directory where the CleanPaste.js script is located.
- In the Javascript where you create your Yahoo Editor object, create an instance of the CleanPaste object, passing in the editor as the parameter:
var myEditor = new YAHOO.widget.Editor('editor', myConfig);
myEditor.render();
var cleanPaste = new CleanPaste(myEditor);
That’s it, the editor should now strip the formatting out of pasted text.
Of course this is still under development, so if you have any problems or feedback please post on CodePlex and I will get back to you.
Cheers, Anthony.
64 comments:
Hello, my name is James Star. Works well on IE, but not Google Chrome. Appears to duplicate. Is this a major fix?
Thanks for the feedback. I'll test in Chrome and see what I can do. Anthony.
The issue with Google Chrome has been resolved, you can download the latest library from CodePlex. Anthony.
Anthony - you are brilliant :-)
is it possible to use this without including the entire YUI library? I primarily use jQuery, so it would be a big drag
Yes, the CleanPaste script does not require any additional YUI libraries that the rich text editor doesn't already require. It really only needs the yahoo-dom-event library.
Anthony.
Great job, Anthony :) Just a minor problem: could you replace the &nbsp; in the copied text with spaces?
Hello again,
I have some problems with Firefox (current latest stable version: 3.0.10), as it does not strip the first ugly block. The problem appears when pasting from Word Viewer 2003. A patch/workaround to make this work in Firefox (and for other stuff like HTML comments) is to replace the last part of your code with this (notice the whitespace \s char, which includes the newline character):
html = html.replace(/<(\/)*(\\?xml:|meta|link|span|font|del|ins|st1:|[ovwxp]:)((.|\s)*?)>/gi, ''); //Unwanted tags
Something else:
probably you should consider also this rule:
html = html.replace(/(class|style|type|start)=(\w*)/gi, ''); // Unwanted attributes (class=CLASSNAME) (different from class="CLASSNAME")
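As a rough sketch of what rules like these do, the snippet below applies the tag rule above, plus a variant of the attribute rule that accepts both quoted and unquoted values, to a small piece of Word-style markup. The sample input, the function name `stripWordMarkup`, and the final whitespace tidy-up are assumptions for illustration; none of this is taken from CleanPaste.js itself.

```javascript
// Hypothetical helper for illustration; not the actual CleanPaste.js code.
function stripWordMarkup(html) {
  // Unwanted tags (the rule quoted in the comment above)
  html = html.replace(/<(\/)*(\\?xml:|meta|link|span|font|del|ins|st1:|[ovwxp]:)((.|\s)*?)>/gi, '');
  // Unwanted attributes, quoted (class="X") or unquoted (class=X)
  html = html.replace(/(class|style|type|start)=("(.*?)"|(\w*))/gi, '');
  // Assumption: tidy up the whitespace left behind inside opening tags
  html = html.replace(/<(\w+)\s+>/g, '<$1>');
  return html;
}

// Made-up sample resembling markup pasted from Word
var sample = '<p class=MsoNormal style="margin:0cm"><span lang=EN-US>Hello <o:p></o:p></span>' +
             '<font face="Arial">world</font></p>';
var cleaned = stripWordMarkup(sample);
console.log(cleaned); // <p>Hello world</p>
```

Note that the attribute rule is applied to the whole string, so it would also remove text such as "type=foo" from ordinary content; that trade-off is inherent in a regex-based approach.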
Hi Andrei, Thanks for contributing to the project. I've tested your code and committed a new release to CodePlex with the changes. Cheers, Anthony.
Thanks for the script, really helped us out.
The only thing that I ran into was a timing issue in FF and IE when pasting.
No real testing, but, it seems when I pasted a large chunk of text, there was a chance that the timer would run before YUI had put the text into the container. This usually resulted in either a null container, or (strangely) an underscore (in FF).
Anyways, as a hack, I added a timer variable in the prototype, and wrapped the CleanPaste code in a try/catch. If an error was thrown, I incremented the timerCount, if it was less than 5 (an arbitrary value on my part), I reran the CleanPaste method. Seems to work fairly well.
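The retry idea described above might look roughly like this. `runWithRetry`, its parameters, and the injectable `schedule` function are hypothetical names for illustration, and this is not the code that went into CleanPaste.js.

```javascript
// Hypothetical helper: run `task`; if it throws, try again after `delayMs`,
// giving up once `maxAttempts` attempts have been made.
// `schedule` defaults to setTimeout and can be swapped out for testing.
function runWithRetry(task, maxAttempts, delayMs, schedule) {
  schedule = schedule || function (fn, ms) { setTimeout(fn, ms); };
  var attempts = 0;

  function attempt() {
    attempts += 1;
    try {
      task(attempts);
    } catch (err) {
      // The container was probably not ready yet, so retry if allowed.
      if (attempts < maxAttempts) {
        schedule(attempt, delayMs);
      }
    }
  }

  attempt();
}
```

A call such as `runWithRetry(doCleanPaste, 5, 50)` would then make up to 5 attempts, where `doCleanPaste` stands in for whatever function performs the clean-up and 50 ms is an arbitrary delay.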
Thanks for the feedback, I'll put this fix into the next release. Cheers, Anthony.
Anthony, ran across this while working on my project. I was able to get it running, but had a question. I copied some paragraphs from Word into the editor and alerted the editor's contents after the paste.
I still see some text like the following in some P tags:
P class=MsoNormal... text here etc
Are those Microsoft class names still supposed to show up in the filtered text?
Good job on the utility.
Thanks, DMc
Things like class=MsoNormal should definitely be stripped out. If you paste your text into the example html file provided, does it also not work?
If it doesn't work, can you please add an issue using the issue tracker on CodePlex, and attach the Word file that doesn't work so I can try and resolve it.
Anthony, false alarm. I figured out the issue and it was on my end. What does your utility do that YUI doesn't do when setting the filterWord configuration attribute to true? I noticed your example also sets this attribute anyway, so I was just wondering.
Thanks, DMc
The filterWord function built into the YUI editor only fires when a document is initially loaded into the editor; it does not fire when additional content is pasted in by the user.
Secondly, the filtering that the YUI component performs only removes a small amount of Word garbage, hence I had to add a bunch more filters to fill the gaps.
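For reference, a minimal sketch of a setup that uses both mechanisms together, assuming the element id 'editor' and the size values shown (both are placeholders). `filterWord` is the YUI Editor configuration attribute discussed above; the `typeof` guard is only there so that the snippet does nothing when YUI or CleanPaste.js is not loaded.

```javascript
// Editor configuration; height and width are placeholder values.
var myConfig = {
  height: '300px',
  width: '600px',
  filterWord: true // YUI's built-in Word filter
};

// Only run when YUI and CleanPaste.js are actually loaded on the page.
if (typeof YAHOO !== 'undefined' && typeof CleanPaste !== 'undefined') {
  var myEditor = new YAHOO.widget.Editor('editor', myConfig);
  myEditor.render();
  // CleanPaste takes care of content pasted in by the user.
  var cleanPaste = new CleanPaste(myEditor);
}
```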
Awesome. Thanks again, DMc
Hi, thank you so much for this script. It really saved me here. It's very smart in that it only strips the garbage but leaves the good things (e.g. lists).
Hi Anthony. Thanks for sharing this. It appears to be exactly what I'm looking for, but I can't find the download on CodePlex. Has it gone or am I looking in the wrong place?
Hi. Click on the "Source Code" tab in CodePlex. From there you can download the latest copy. Cheers, Anthony.
Hi Anthony,
This script is very useful, as there are few or no resources available for achieving this result using the YUI RTE alone.
I have come across a particular test case in which your script fails completely. The div with the id 'Cleaner' is not being inserted, and as a result the whole code fails.
If I insert some content in the editor textarea that starts with a p element like the following <p> </p> followed by the actual content of the editor that I want to load, the script fails.
Hi,
Can you show an example on how you listen to the paste event on the YUI editor?
Hi JP, can you please attach a sample HTML file that causes the script to fail using the Issue tracker in CodePlex. I'll then take a look. Anthony.
Hi csjtheman, unfortunately there is no single "paste" event supported by the rich text editor, so I had to use a number of hacks to achieve the same thing. You can see this in the init() function.
That said, I could probably get my script to expose a paste event of its own if you think that'd be useful...
Thank you for getting back so soon.
It would be extremely useful to me with a "paste" event, as I'm making a CMS system with the YUI editor as a base.
Right now, the users can paste all kind of weird html into the editor, and it will be displayed like that in view mode, ignoring all my carefully crafted styles :-)
So I - for one - would very much like to see such an event.
Kind regards,
Christian Sonne Jensen
Hi Christian, I've added 2 events to the script: OnBeforePaste and OnAfterPaste. You can see how to use them in the Example.htm file provided with the script.
Anthony.
Works great, thanks!
Hey great script! But if I copy this code:
{intro assets}
The following questions are about assets and loans.
The space between 'and loans' is gone.
this happens once in every sentence .
Any ideas?
I tested your sentence above but it appeared to work fine for me. Can you please attach the original HTML as a file using the Issue tracker in CodePlex and I'll try it again.
Anthony.
Thanks Anthony, it worked like magic :) ..
Anthony, great job on the script! I have found an issue: the line that stops the context menu for Mozilla browsers seems to be causing a problem in IE7, where it clears all the content when you right-click in the editor.
Thanks for spotting this, I'll take a look and see what I can do. Anthony.
When I copy a paragraph from Word, the sentences automatically wrap. How do I fix that problem?
Hi GuruFocus, sorry I don't understand the problem. Perhaps you could take a screenshot of the problem and create a new issue on the CodePlex site. I'll then take a look.
Cheers,
Anthony.
Thank you for the quick response. I copied a paragraph from MS Word and pasted it into the editor, but there is some format info left. (Your blog does not allow me to post it, as it considers it HTML code.)
Also, if I copy a paragraph like:
MICHAEL HARTNETT: Yes, I think a little bit. Everything looks, technically, a little extended. But I’d still be a cyclical bull. I think the bottom line this year is, for me, the fundamental valuations are not so important. You just had an unprecedented bear market, an unprecedented macro meltdown, an unprecedented sort of policy response and I think you're in the midst of unprecedented rally in risk of the back of very oversold levels. And I think that we're not at the end of that risk rally and we won't be until the central banks end their quantitative easing policies.
It became:
MICHAEL HARTNETT: Yes, I think a little bit. Everything
looks, technically, a little extended. But I’d still be a cyclical bull. I
think the bottom line this year is, for me, the fundamental valuations are not
so important. You just had an unprecedented bear market, an unprecedented macro
meltdown, an unprecedented sort of policy response and I think you're in the
midst of unprecedented rally in risk of the back of very oversold levels. And I
think that we're not at the end of that risk rally and we won't be until the
central banks end their quantitative easing policies.
You can see it created a lot of new lines.
Maybe you can test it here:
http://www.gurufocus.com/test/test_yui_bbcode.php
Hope you can help. Thanks!
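If the extra line breaks come from newline characters inside the pasted paragraphs (an assumption; the cause was not confirmed in this thread), one possible workaround is to collapse them into single spaces before the text is inserted. `collapseLineBreaks` is a hypothetical helper, not part of CleanPaste.js.

```javascript
// Hypothetical helper: replace each run of line breaks (and any whitespace
// around it) with a single space, so a paragraph stays on one line.
function collapseLineBreaks(html) {
  return html.replace(/\s*[\r\n]+\s*/g, ' ');
}

var pasted = 'MICHAEL HARTNETT: Yes, I think a little bit. Everything\r\n' +
             'looks, technically, a little extended.';
console.log(collapseLineBreaks(pasted));
```

This treats every line break as insignificant, so it would not be suitable for content where line breaks matter, such as text inside pre tags.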
Hi, Anthony!
I was using Firefox. When I use IE, it works perfectly.
Can you help with IE?
Anthony,
If I use CleanPaste in IE, it works fine. But in Firefox, the style, xml, and object code is still there. Am I doing something wrong?
Any help is gratefully appreciated.
Thanks for this script!
If you select some text in the editor and apply formatting to it (let's say, bold) and then copy and paste that, it is inserted into the RTE as a strong tag. At that point you can't unapply the formatting, since the editor is expecting a bold tag. Of course it will save it as a strong tag when you submit anyway...
In any case, fixed this by adding html = this.Editor._cleanIncomingHTML(html) after your html = this.Editor.cleanHTML(html);
Regards,
McKinley
This code seems to cause a problem if some text has been selected and the user left clicks to copy it.
It then replaces the selected text with an underscore, which is far from ideal.
Any suggestions on how to avoid this behaviour?
Great Stuff. Many thanks for this. I found that I got better results using the multi line replace described here:
http://wolfram.kriesing.de/blog/index.php/2008/javascript-multiline-replace
Hi GuruFocus, sorry for the long delay. I tested the paragraph you sent me in Firefox (v3.0), however I did not experience the same problem. Perhaps you can attach a sample word document to the Issue Tracker in CodePlex and I'll try that.
Regarding your second problem about xml nodes etc still being there, it sounds like you have not correctly initialised the clean paste script. There is a sample HTML file included with the script, try pasting your text into that and see if the problem still occurs.
Cheers,
Anthony.
Hi Lars, sorry for the delay. Can you tell me which browser & version you are using where the problem occurs?
Thanks, Anthony.
Hey Anthony,
Using the latest FF and have confirmed it in FF 3.0 as well, I have identified that it happens when you use the contextmenu event.
Thanks Lars. I tested in FF 3.0.15 (Windows) by left-clicking on some selected text, however it worked fine for me.
Are you on a different platform? Does the problem occur for you in the Example.htm provided with the script? Or only in your implementation?
Anthony.
Anthony,
Thanks for your efforts with this; they are most appreciated.
I'm getting similar behavior to Lars. I'm using IE 8.0.6 on WinXP with your unchanged example.htm
Select a word and right click it - the word is replaced with an underscore on a new line. Using the keyboard (Ctrl+X) works okay
FF 3.5.3 the right click is ignored
Chrome 3.0.1 - When I right click, the word is deselected and the context menu is not displayed
Opera 10.01 - Seems to work okay, although when I cut a word out using the context menu the text to the left of the cut word is shifted to the right effectively indenting the line?? weird??
All these browsers work okay in an RTE without CleanPaste.
Cheers Al
Thanks Allan. There is code in the script to try and disable the context menu, as it contains a Paste item which would bypass the script.
I'll take another look and see what's going on.
Anthony.
Anthony,
It appears that with IE8 the onbeforepaste event fires on right click.
All the modern browsers that I've tested, except Opera 10, support the onpaste event:
IE 8.06 - Yes
IE 7 - Yes
IE 6 - No
FF 3.5.3 - Yes
Chrome 3.0.1 - Yes
Opera 10.01 - No
Safari (on a Mac) - Yes
Would it be possible to test for onpaste event support before disabling the context menu?
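A check along these lines could be sketched as follows. `supportsEvent` is a hypothetical helper built on a common feature-detection technique (looking for the `on...` property, with a `setAttribute` fallback for browsers that only expose the handler once the attribute is set); it is not part of CleanPaste.js or YUI.

```javascript
// Hypothetical helper: returns true if `element` appears to support `eventName`.
function supportsEvent(element, eventName) {
  var prop = 'on' + eventName;
  if (prop in element) {
    return true;
  }
  // Fallback: set the attribute and see whether the browser turns it into a handler.
  if (typeof element.setAttribute === 'function') {
    element.setAttribute(prop, 'return;');
    var supported = typeof element[prop] === 'function';
    element.removeAttribute(prop);
    return supported;
  }
  return false;
}
```

The script could then disable the context menu only when `supportsEvent(editorBody, 'paste')` returns false, where `editorBody` stands in for the editable body element of the editor.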
Cheers Al
Anthony. I'm not a js programmer. I'm trying to use your code in an app that has multiple YUI editors on a page. I can see the CleanPaste.js in the header, but when I paste into the editor the text is not getting cleaned.
Hi Paul. See the 4 steps at the top of this post. For each YUI editor you need to add this line of javascript:
var cleanPaste = new CleanPaste(myEditor);
Where "myEditor" is the variable name of the editor.
Cheers,
Anthony.
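For a page with several editors, the same line can be applied in a loop. This is a minimal sketch; `attachCleanPasteToAll` is a hypothetical helper name, and the constructor is passed in as a parameter purely so the snippet does not depend on script load order.

```javascript
// Hypothetical helper: create one CleanPaste instance per editor.
// `editors` is an array of YUI editor objects; `CleanPasteCtor` is the
// CleanPaste constructor from CleanPaste.js.
function attachCleanPasteToAll(editors, CleanPasteCtor) {
  var instances = [];
  for (var i = 0; i < editors.length; i++) {
    instances.push(new CleanPasteCtor(editors[i]));
  }
  return instances;
}
```

For example, `attachCleanPasteToAll([editorOne, editorTwo], CleanPaste)`, where `editorOne` and `editorTwo` are placeholder names for the editor variables on your page.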
Hey Anthony,
first I'd like to thank you for this great addon for the YUI RTE. It's very bad how often people copy and paste some texts from Word documents into online Richtext editors.
I have an enhancement for one regular expression found in the CleanHTML method of the CleanPaste addon:
At line 127 the RegExp might be extended to
html = html.replace(//gim, ''); // HTML comments
This will include conditional comments and multi line code in between comment lines.
Tested it in Firefox 3.6
Best regards,
tommy
Regarding the comments stripping, that regexp worked much better for me
/<(?:--[\s\S]*?--\s*)?>\s*/gi
Regards.
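As an illustration of multi-line comment stripping, here is a minimal sketch. The pattern below is an assumption chosen for the example and is not necessarily the expression used in CleanPaste.js or in the comments above; `stripHtmlComments` and the sample input are hypothetical.

```javascript
// Hypothetical helper: remove HTML comments, including multi-line ones and
// Word's conditional comments of the form <!--[if ...]> ... <![endif]-->.
function stripHtmlComments(html) {
  // [\s\S] matches any character including newlines; *? stops at the first "-->".
  return html.replace(/<!--[\s\S]*?-->/g, '');
}

var input = '<p>a</p><!--[if gte mso 9]><xml>\n' +
            '<w:WordDocument></w:WordDocument>\n' +
            '</xml><![endif]--><p>b</p>';
console.log(stripHtmlComments(input)); // <p>a</p><p>b</p>
```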
does it support simpleeditor as well?
Yes.
Heya .. thanks a lot for this script!
One issue I am facing is with IE8: if I do a CTRL+V (it works with the context menu paste) to paste text into an editor that already has some text, it gives a JS error saying Object required (line 85). Also, sometimes if there is text already and I paste some text, the whole text gets repeated, so each time I paste, the text in the editor doubles.
Thanks for the great project here!
I experience the following error with Safari and Chrome (webkit?) from line 85 of CleanPaste.js using the example html ->
"TypeError: Result of expression 'container' [null] is not an object."
---
I'll dig into it, but want to see if anyone has travelled this path already :)
Thanks again!
Thanks Anthony - this looks cool. Do you have or know of a standalone version that will work with other editors?
Hi, I found that in IE, when you right-click, the text in the RTE is removed. In other browsers the right click is disabled.
I love you. This just saved me a few days of work!
For those with IE8 errors on line 85, switching to a setInterval method seemed to work for me. It looks like IE8 was taking too long to create the container div and didn't make it before the setTimeout occurred.
My code looks something like this:
var self = this; // keep a reference to the CleanPaste instance for the callbacks

var handlePaste = function() {
    var container = self.Editor._getDoc().getElementById("Cleaner");
    var sourceText = container.innerHTML;
    var cleanText = cleanHTML(sourceText);
    var newText = document.createElement('span');
    ...
};

// Poll every 10 ms until the "Cleaner" container exists, then handle the paste once.
var containerCreatedInterval = window.setInterval(function() {
    if (self.Editor._getDoc().getElementById("Cleaner")) {
        window.clearInterval(containerCreatedInterval);
        handlePaste();
    }
}, 10);
The call to execCommand('inserthtml', "<div id='Cleaner'>_</div>") seems to fail if you paste text from Word into the middle of an existing paragraph. Any idea why?
Hi Anthony,
I am facing an issue with the CleanPaste utility. I have modified the CleanPaste library to remove most of the tags, but the content I copy from a Word document does not stay on the line it should be on; it is broken into multiple lines.
These are the rules I am using:
// Remove additional MS Word content
// html = html.replace(/<(\/)*(\\?xml:|meta|link|span|font|del|ins|st1:|[ovwxp]:)((.|\s)*?)>/gi, ''); // Unwanted tags
// html = html.replace(/(class|style|type|start)=("(.*?)"|(\w*))/gi, ''); // Unwanted attributes
// html = html.replace(//gi, ''); // Style tags
// html = html.replace(//gi, ''); // Script tags
// html = html.replace(//gi, ''); // HTML comments
alert("HTML"+html);
html = html.replace(/<(\w[^>]*) class=([^ |>]*)([^>]*)/gi, "<$1$3") ;
html = html.replace( /<(\w[^>]*) style="([^\"]*)"([^>]*)/gi, "<$1$3" ) ;
html = html.replace( /\s*style="\s*"/gi, '' );
html = html.replace( /]*>\s* \s*<\/SPAN>/gi, '' ) ;
var re = new RegExp("(]*>.*?)(<\/P>)","gi") ;
html = html.replace( re, "" ) ;
html = html.replace( /]*><\/SPAN>/gi, '' ) ;
html = html.replace(/<(\w[^>]*) lang=([^ |>]*)([^>]*)/gi, "<$1$3") ;
html = html.replace( /(.*?)<\/SPAN>/gi, '$1' ) ;
html = html.replace(/\s*<\/o:p>/g, "") ;
html = html.replace(/.*?<\/o:p>/g, " ") ;
html = html.replace( /\s*mso-[^:]+:[^;"]+;?/gi, "" ) ;
html = html.replace( /\*mso-[^:]+:[^;"]+;?/gi, "" ) ;
html = html.replace(//gi, '');
html = html.replace(//gi, '');
html = html.replace(//gi, '');
html = html.replace(/<(\/)*(\\?xml:|meta|link|span|font|del|ins|st1:|[ovwxp]:)((.|\s)*?)>/gi, '');
html = html.replace(/<(a){1}.*?>/i,'');
html = html.replace(//gi, ''); // Style tags
html = html.replace(/(class|style|type|start)=("(.*?)"|(\w*))/gi, ''); // Unwanted attributes
alert("HTML after Final"+html);
//html = html.replace( /<[^<>]+>/g, ''); //remove all tags
The Word file I am copying from has the content below, where the data is broken into lines. In the editor it shows on the same line, but when I look at it in the alert box or in the PDF, it is broken into multiple lines.
So I need your help.
Single sign on from Portal to Lotus is achieved through SAP Logon ticket, which is issued by SAP Portal and stored as browser cookie, which is accept by a lotus Domino(DSAPI Filter).
2. Implementation
The solution implements the approach which makes use of LtpaToken to SSO from EP to domino servers running on non-windows platform.
a) The DSAPI filter needs to be installed only on the Domino locator server and not
on each and every domino server in the landscape.
b) A single lotus transport needs to be created in the portal corresponding to the
Domino locator server, since locating the mail server of the user is handled
internally.
Domino Side Configuration
1. The landscape can have multiple domino servers out of which one has to be the Domino Locator server.
Hi Anthony,
I am facing one strange issue. I am pasting some text from a Word document into the YUI Editor with the CleanPaste utility in use. In IE, the text I paste is replaced by the unformatted text, but in Firefox it duplicates the content that is already inside the editor.
So I need your help.
Regards
Kam
Thanks for your hard work with this, it is incredibly useful.
CleanPaste works fine at removing formatting, but once the cleaned content is inserted into the RTE (rich text editor), if you select the text and right-click, the content disappears and is replaced by the character passed to execCommand(). It would be great if someone could help me with this. My id (charan.cse@gmail.com)