TOPIC: StringBuilder performance

StringBuilder performance 12 Mar 2020 17:05 #13712

  • wriedmann

  • Topic Author


  • Posts: 2280
  • Hi all interested people,
    please see this code:
    cBuffer := DateTime.Now:ToString()
    foreach oTag as PlanTag in _oPlanTage
        	cBuffer := cBuffer + oTag:DebugString( 1 )
        	foreach oPosition as PlanPosition in _oPlanPositionen
        		cBuffer := cBuffer + oPosition:DebugString( 1 )
        	next
    next
    cBuffer := cBuffer + DateTime.Now:ToString()
    In my application this code creates a text file with over 95,000 lines.
    The code takes a lot of time (5 minutes 36 seconds) and uses an entire processor core.
    A simple optimization makes it behave better:
    cBuffer := DateTime.Now:ToString()
    foreach oTag as PlanTag in _oPlanTage
     	cBuffer := cBuffer + oTag:DebugString( 1 )
      	cPosition := ""
        	foreach oPosition as PlanPosition in _oPlanPositionen
        		cPosition := cPosition + oPosition:DebugString( 1 )
        	next
        	cBuffer := cBuffer + cPosition
    next
    cBuffer := cBuffer + DateTime.Now:ToString()
    The only change is that instead of adding every substring to the main buffer there is an intermediate buffer.
    This reduces the needed time to about 4 seconds!!!
    Using the StringBuilder class makes the code faster still:
    oSB := StringBuilder{}
    oSB:AppendLine( DateTime.Now:ToString() )
    foreach oTag as PlanTag in _oPlanTage
        	oSB:Append( oTag:DebugString( 1 ) )
        	foreach oPosition as PlanPosition in _oPlanPositionen
        		oSB:Append( oPosition:DebugString( 1 ) )
        	next
    next
    oSB:AppendLine( DateTime.Now:ToString() )
    cBuffer := oSB:ToString()
    The code now takes only 2 seconds!
    Wolfgang
    P.S. in VO you can see similar differences, but there is no StringBuilder class available.
    Wolfgang Riedmann
    Meran, South Tyrol, Italy

    www.riedmann.it - docs.xsharp.it
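
    For anyone who wants to reproduce the comparison without the original application, here is a minimal X# timing sketch. The sample line, the line count of 20,000 and the use of Stopwatch are assumptions for illustration, so the numbers it prints will differ from the timings quoted in the post.

```xsharp
USING System
USING System.Diagnostics
USING System.Text

FUNCTION Start() AS VOID
    LOCAL nLines := 20000 AS INT    // hypothetical count, smaller than the 95,000 lines in the post
    LOCAL cLine AS STRING
    LOCAL cBuffer AS STRING
    LOCAL oSB AS StringBuilder
    LOCAL oWatch AS Stopwatch
    LOCAL nI AS INT

    // Stand-in for what oPosition:DebugString( 1 ) would return
    cLine := "some debug text for one plan position" + Environment.NewLine

    // 1) Plain concatenation: each step builds a new string from the whole buffer
    oWatch := Stopwatch.StartNew()
    cBuffer := ""
    FOR nI := 1 UPTO nLines
        cBuffer := cBuffer + cLine
    NEXT
    oWatch:Stop()
    Console.WriteLine( "Concatenation: " + oWatch:ElapsedMilliseconds:ToString() + " ms" )

    // 2) StringBuilder: appends go into a growing internal buffer
    oWatch := Stopwatch.StartNew()
    oSB := StringBuilder{}
    FOR nI := 1 UPTO nLines
        oSB:Append( cLine )
    NEXT
    cBuffer := oSB:ToString()
    oWatch:Stop()
    Console.WriteLine( "StringBuilder: " + oWatch:ElapsedMilliseconds:ToString() + " ms" )

    RETURN
```

    Raising nLines should make the gap between the two variants more visible, since the cost of plain concatenation grows faster than the line count.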

    StringBuilder performance 12 Mar 2020 17:31 #13713

  • Chris


  • Posts: 1945
  • Hi Wolfgang,

    Very good sample!

    Furthermore, if you know the approximate size of the final string in advance, specify it in the constructor of the StringBuilder object. The internal buffer is then allocated only once (instead of dozens of times if you do not specify a starting size), which further improves performance.

    Also, if you do this very often in your app, it is a good idea to reuse a single StringBuilder object instead of creating a new one every time. Just reset it to zero length when you are done with it (with oSB:Length := 0). This keeps the internal buffer intact, so no further memory is allocated when you generate new text in the string builder. The only further allocation happens when converting it to a normal string.
    XSharp Development Team
    chris(at)xsharp.eu
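
    A minimal sketch of both suggestions, assuming X# with the .NET System.Text.StringBuilder. The function name BuildReport, the 64K starting capacity and the sample lines are made up for illustration and are not taken from Wolfgang's application.

```xsharp
USING System
USING System.Text

// Hypothetical helper: fills the builder passed in and returns the text.
FUNCTION BuildReport( oSB AS StringBuilder, nLines AS INT ) AS STRING
    LOCAL nI AS INT

    oSB:Length := 0                     // drop old content, keep the internal buffer
    FOR nI := 1 UPTO nLines
        oSB:AppendLine( "line " + nI:ToString() )
    NEXT
    RETURN oSB:ToString()               // allocates the final string

FUNCTION Start() AS VOID
    LOCAL oSB AS StringBuilder
    LOCAL cFirst, cSecond AS STRING

    oSB := StringBuilder{ 64 * 1024 }   // rough estimate of the final size, in characters
    cFirst  := BuildReport( oSB, 1000 )
    cSecond := BuildReport( oSB, 500 )  // second run reuses the same builder

    Console.WriteLine( cFirst:Length )
    Console.WriteLine( cSecond:Length )
    Console.WriteLine( oSB:Capacity )   // capacity is kept after the reset
    RETURN
```

    If the estimate turns out too small, the builder still grows on its own; the estimate only saves the intermediate reallocations.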

    StringBuilder performance 12 Mar 2020 19:47 #13714

  • wriedmann

  • Topic Author


  • Posts: 2280
  • Hi Chris,
    I had tried to build a StringBuilder class in VO, but unfortunately it was slower than a simple string concatenation as in the 2nd sample.
    This is the relevant VO code:
    class StringBuilder
    protect _aElements			as array
    	
    declare method Append
    declare method GetString
    	
    method Init() class StringBuilder
    _aElements := {}
    return self
    	
    method Append( cString as string ) as void pascal class StringBuilder
    AAdd( _aElements, cString )
    return
    	
    method GetString() as string pascal class StringBuilder
    local ptrResult as byte ptr
    local ptrTemp as byte ptr
    local nLen as dword
    local nI as dword
    local nBufLen as dword 
    local nTotalLen as dword 
    local nIndex as dword
    local cBuffer as string
    local cResult as string
    	
    nLen := ALen( _aElements )
    nTotalLen := 0
    for nI := 1 upto nLen         
      cBuffer := _aElements[nI]
      nTotalLen := nTotalLen + SLen( cBuffer )
    next
    if nTotalLen == 0
      cResult := ""
    else
      ptrResult := MemAlloc( nTotalLen )
      if ptrResult == null_ptr
        _Break( "memory allocation error - failed to allocate " + NTrim( nTotalLen ) + " bytes" )
      endif
      ptrTemp := ptrResult
      nIndex := 0
      for nI := 1 upto nLen         
        cBuffer := _aElements[nI]
        nBufLen := SLen( cBuffer )
        MemCopyString( ptrTemp, cBuffer, nBufLen )
        ptrTemp := ptrTemp + nBufLen
      next
      cResult := Mem2String( ptrResult, nTotalLen )
      MemFree( ptrResult ) 
    endif
    	
    return cResult
    I'm pretty sure this code can be enhanced, but after the first checks I decided not to put more time into this class.
    Wolfgang

    StringBuilder performance 12 Mar 2020 21:18 #13715

  • Jamal


  • Posts: 198
  • Hi Wolfgang,

    While you are at it, I am just wondering: if you create an X# or C# COM object, initialize a StringBuilder object in it as Chris suggested, and then use it in a similar fashion, what would the performance be beyond the initial COM object call?

    Jamal

    StringBuilder performance 13 Mar 2020 06:45 #13716

  • wriedmann

  • Topic Author


  • Posts: 2280
  • Hi Jamal,
    I have not tested that, but in my experience (and I do a LOT of COM interaction between X# modules and VO applications) the COM interface is not very fast (and cannot be very fast because there is a lot of code and a lot of conversions involved).
    Wolfgang

    StringBuilder performance 13 Mar 2020 09:30 #13717

  • Serggio


  • Posts: 24
  • You're welcome (see the attachment)
    Attachments:

    StringBuilder performance 14 Mar 2020 09:42 #13722

  • Karl-Heinz


  • Posts: 574
  • wriedmann wrote: Hi Chris,
    I had tried to build a StringBuilder class in VO, but unfortunately it was slower than a simple string concatenation as in the 2nd sample.

    Hi Wolfgang,

    I agree. Even when I use static memory only, I see no speed advantage. Maybe I overlooked something, but when I compare the results of your StringBuilder with mine, the speed difference is not as large as I would expect.
    CLASS StringbuilderMem  INHERIT Vobject
    PROTECT _ptrValue  AS BYTE PTR
    PROTECT _dwCurrentPos AS DWORD
    PROTECT _dwStep := 2000 AS DWORD 
    
    DECLARE METHOD Append
    DECLARE METHOD GetString 
    
    METHOD Append ( cValue AS STRING )  AS VOID PASCAL CLASS StringbuilderMem  
    LOCAL dwLen AS DWORD             
    
          
    	dwLen := SLen ( cValue ) 
    		
    
    	IF dwLen > 0  
          	
    				
    		IF _dwCurrentPos + dwLen  > MemLen ( _ptrValue ) 
    					
    //		  ? "MemRealloc"  ,  MemLen ( _ptrValue )  , dwLen , _dwCurrentPos  
    					
    			// Grow by at least dwLen so that a string longer than _dwStep still fits
    			_ptrValue := MemRealloc ( _ptrValue , MemLen (_ptrValue ) + Max ( _dwStep , dwLen ) )
    					
    		ENDIF 	
    				
    
      	   MemCopyString ( PTR ( _CAST , _ptrValue  + _dwCurrentPos )  , cValue , dwLen )  
    
         	_dwCurrentPos += dwLen
    	     	
    	ENDIF	     	
    
    
    	RETURN 
    METHOD Destroy() CLASS StringbuilderMem 
    
    	
    	UnRegisterAxit(SELF) 
       	
    	IF _ptrValue != NULL_PTR 
    		MemFree ( _ptrValue ) 		
    		
    	ENDIF 	
    
    	RETURN NIL 
    
    
    METHOD GetString() AS STRING PASCAL CLASS StringbuilderMem 
     
    	IF  _ptrValue == NULL_PTR  .OR. _dwCurrentPos == 0
    		RETURN NULL_STRING
    		
    	ENDIF			
    		
    	RETURN Mem2String ( _ptrValue , _dwCurrentPos ) 
    
    	                      
    
    METHOD Init( nCapacity ) CLASS StringbuilderMem  
    
    
    	Default (@nCapacity, _dwStep )   
    	
    
    	_ptrValue := MemAlloc ( nCapacity )
    		
    	_dwStep := nCapacity
    	 
      	RegisterAxit ( SELF )
      	           
    
    	RETURN SELF  
    

    regards
    Karl-Heinz
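
    One thing the class above could try, sketched here in the same VO-style syntax, is to grow the buffer geometrically instead of by a fixed _dwStep. With a fixed step, a long result needs one MemRealloc per step, while doubling keeps the number of reallocations small. EnsureCapacity is a hypothetical helper, not part of Karl-Heinz's class, and whether it changes the timings he measured has not been tested.

```xsharp
// Hypothetical addition to StringbuilderMem; it needs a matching
// "DECLARE METHOD EnsureCapacity" line in the class declaration.
METHOD EnsureCapacity( dwNeeded AS DWORD ) AS VOID PASCAL CLASS StringbuilderMem
LOCAL dwSize AS DWORD

	dwSize := MemLen ( _ptrValue )
	IF dwSize == 0
		dwSize := 2000              // assumed fallback, same value as the default _dwStep
	ENDIF

	IF dwNeeded > MemLen ( _ptrValue )
		// Double the size until the requested length fits
		DO WHILE dwSize < dwNeeded
			dwSize := dwSize * 2
		ENDDO
		_ptrValue := MemRealloc ( _ptrValue , dwSize )
	ENDIF

	RETURN
```

    In Append, the size check and the MemRealloc call would then be replaced by SELF:EnsureCapacity ( _dwCurrentPos + dwLen ).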

    StringBuilder performance 14 Mar 2020 16:50 #13725

  • ArneOrtlinghaus


  • Posts: 184
  • I have also found that frequently repeated string operations on strings of 1000 characters and more get very expensive. Many years ago in VO I wrote a class similar to StringBuilder that uses the MemAlloc functions to avoid triggering the garbage collector, and there was a huge difference in speed. Now with X# it is very similar: dynamic memory can get costly. Tests with a performance profiler show that much time goes into handling strings, even in fully strongly typed code.

    StringBuilder performance 16 Mar 2020 16:23 #13738

  • mainhatten


  • Posts: 139
  • wriedmann wrote: I have not tested that, but in my experience (and I do a LOT of COM interaction between X# modules and VO applications) the COM interface is not very fast (and cannot be very fast because there is a lot of code and a lot of conversions involved).

    Hi Wolfgang,
    My gut reaction is to follow my second programming mantra, "Chunky, not Chatty", when calling across layers, as such layers sometimes have real physical borders, in this case the marshalling code. I am pretty certain that your first example, done across COM into a StringBuilder, would be slower, at least at first and for strings that are not really large. With the second example, which first concatenates lots of tiny strings into an intermediate string and then does one large append, the benefit of not burdening memory management with large discarded memory areas might be greater, as the target string is in the multi-megabyte range.

    In vfp we have similar issues, typically when string sizes rise above 10K and memory allotment is set for a small VM. The typical response is similar to your second way (as we have no StringBuilder type), although often with the twist of adding the small strings not into one string but into a small array of strings, which can then be concatenated in one line:
    dimension laTmp[7]
    laTmp = ""   && setting all elements to 1 start value is nice in this context
    for lnRun = 1 to 7
        *-- build laTmps
    next
    lcLargeString = lcLargeString + laTmp[1] + laTmp[2] + laTmp[3] + laTmp[4] + laTmp[5] + laTmp[6] + laTmp[7] 
    The slow part is not the concatenation itself, but releasing the previous variable, claiming new memory and assigning the total of the right-hand side of the line. This can be seen by measuring: as lcLargeString grows, adding strings of identical length gets slower.
    But the easiest way (even if it goes against the "RAM is always faster" reflex) is to open a buffered low-level file and just append the new strings until the result is finished. If needed, loading it once with FileToStr() for further processing is often faster than always memcpying it around in process space, as all internal memory allotment and garbage collection is sidestepped until the final load.
    Unixoid behaviour makes sense there and is even easier to code and read. (I noticed that xSharp does not differentiate between buffered and unbuffered LLF, but as buffered was/is the vfp default behaviour, the xSharp LLF implementation probably defaults to buffered as well. Question already raised on GIT.)

    That was true on old hard disks, and SSDs have improved write throughput as well.

    regards
    thomas
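
    The file-based approach can be sketched in X# roughly as follows. Note that this uses the .NET StreamWriter and File.ReadAllText rather than the low-level file functions Thomas refers to, and that the temporary file and the loop data are placeholders.

```xsharp
USING System
USING System.IO

FUNCTION Start() AS VOID
    LOCAL cFile AS STRING
    LOCAL oWriter AS StreamWriter
    LOCAL cResult AS STRING
    LOCAL nI AS INT

    cFile := Path.GetTempFileName()         // placeholder target file
    oWriter := StreamWriter{ cFile }        // StreamWriter buffers its writes internally
    TRY
        FOR nI := 1 UPTO 1000
            oWriter:WriteLine( "line " + nI:ToString() )
        NEXT
    FINALLY
        oWriter:Dispose()                   // flushes the buffer and closes the file
    END TRY

    // Load the text once, and only if it is needed in memory at all
    cResult := File.ReadAllText( cFile )
    Console.WriteLine( cResult:Length )

    File.Delete( cFile )
    RETURN
```

    If the result is only meant to end up in a text file anyway, as in Wolfgang's case, the final ReadAllText step can be dropped entirely.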
